A NECESSARY AND SUFFICIENT CONDITION FOR EDGE UNIVERSALITY OF WIGNER MATRICES

JI OON LEE AND JUN YIN 

Abstract. In this paper, we prove a necessary and sufficient condition for the Tracy-Widom law of Wigner matrices. Consider $N \times N$ symmetric Wigner matrices $H$ with $H_{ij} = N^{-1/2} x_{ij}$, whose upper right entries $x_{ij}$ ($1 \le i < j \le N$) are i.i.d. random variables with distribution $\nu$ and whose diagonal entries $x_{ii}$ ($1 \le i \le N$) are i.i.d. random variables with distribution $\tilde\nu$. The means of $\nu$ and $\tilde\nu$ are zero, the variance of $\nu$ is 1, and the variance of $\tilde\nu$ is finite. We prove that the Tracy-Widom law holds if and only if $\lim_{s\to\infty} s^4\,\mathbb{P}(|x_{12}| \ge s) = 0$. The same criterion holds for Hermitian Wigner matrices.



1. Background and Main result 

Since the groundbreaking work by Wigner [53], it has been conjectured and widely believed that the local statistics of eigenvalues of random matrices are universal in the sense that they depend only on the symmetry class of the ensemble. Universality is one of the most important concepts in random matrix theory, and it can roughly be divided into two types, the bulk universality and the edge universality.

Before considering the edge universality, which we study in this paper, we briefly recall some important results on bulk universality. The bulk universality concerns the local statistics of eigenvalues in the interior of the spectrum. In the early works of Wigner, Dyson, Gaudin, and Mehta [35J OH HU [T^], it was proved that, after proper rescaling, the joint probability density of eigenvalues of the Gaussian Unitary Ensemble (GUE) can be explicitly described by the sine kernel, and they conjectured that the universality holds for more general classes of ensembles. For a very general class of invariant ensembles, the bulk universality was proved by Deift et al. [321 [14], Bleher and Its [8], and Pastur and Shcherbina [40]. Later, the bulk universality was proved for Gaussian divisible ensembles by Johansson [29]. (See also the work by Ben Arous and Péché [6].) For general Wigner matrices, a new approach was introduced to prove the bulk universality in a series of papers by Erdős, Schlein, Yau, and others in [17j [HI HH [20l HU [24j [25J HH [15l [16]. The bulk universality for Wigner matrices was also obtained by Tao and Vu [50]. See the reviews [22] [23] for further discussion.

The distribution of the largest eigenvalue exhibits another type of universality, which is called the edge universality. Let $\lambda_N$ be the largest eigenvalue of a Wigner matrix. For the Gaussian ensembles, the distribution function of $\lambda_N$ was first identified by Tracy and Widom [331 [35]. More precisely, it is proved that

$\lim_{N\to\infty} \mathbb{P}\big(N^{2/3}(\lambda_N - 2) \le s\big) = F_\beta(s)$, (1.1)

where the Tracy-Widom distribution functions $F_\beta$ can be described by Painlevé equations, and $\beta = 1, 2, 4$ corresponds to the Orthogonal/Unitary/Symplectic ensemble, respectively. The joint distribution of the $k$ largest eigenvalues can be expressed in terms of the Airy kernel, which was shown by Forrester [27]. If we denote the $k$ largest eigenvalues by $\lambda_N, \lambda_{N-1}, \ldots, \lambda_{N-k+1}$, then for Gaussian ensembles, the joint distribution function of the rescaled eigenvalues has the limit

$\lim_{N\to\infty} \mathbb{P}\big(N^{2/3}(\lambda_N - 2) \le s_1,\ N^{2/3}(\lambda_{N-1} - 2) \le s_2,\ \ldots,\ N^{2/3}(\lambda_{N-k+1} - 2) \le s_k\big) = F_{\beta,k}(s_1, s_2, \ldots, s_k)$, (1.2)

which will also be called the Tracy-Widom distribution.

The condition for (1.2) has been studied intensively. In the direction of sufficient conditions, it has been improved as follows: The result (1.2) was first extended to general Wigner matrices by Soshnikov [47] under the condition that all odd moments of the matrix entries vanish (e.g., for symmetric distributions) and that the entries have Gaussian decay. Ruzmaikina [43] showed that the Gaussian decay can be replaced with polynomial decay faster than $x^{-18}$. Also, under the condition that the matrix entries are symmetrically distributed, Khorunzhiy [32] proved a bound for the spectral norm of matrices whose entries have finite $12 + o(1)$ moment. For the non-symmetric case, (1.2) is proved in [5T] by Tao and Vu under the condition that the matrix entries have vanishing third moment and sub-exponential decay. (Some partial results in the non-symmetric case can be found in [38] and [39].) Later, the vanishing third moment condition was removed by Erdős, Yau, and others in |26i 116), i.e., (1.2) is implied by the sub-exponential decay condition. The current best sufficient condition for (1.2), as far as we know, is that the matrix entries have finite $12 + o(1)$ moments, which was proved in [16]. Numerical results by Biroli, Bouchaud, and Potters [7] predicted that the Tracy-Widom distribution would appear when the $(4 + \epsilon)$-th moment is finite.

On the other hand, for Wigner matrices whose entries have heavy tails, necessary conditions for (1.2) have been studied as follows: In the case of real symmetric matrices with i.i.d. entries, it was proved by Soshnikov [35] that, when the variance of the entries diverges, the largest eigenvalue has Poisson statistics. More precisely, [35] considered the case where the distribution of the entries satisfies

$\mathbb{P}(|h_{ij}| > x) = \frac{h(x)}{x^{\alpha}}$, (1.3)

where $h(x)$ is a slowly varying function and $\alpha < 2$. The case $2 \le \alpha < 4$ was later studied by Auffinger, Ben Arous, and Péché [JJ, which also shows the Poisson statistics. We also remark that in the case $\alpha < 2$ the Wigner semicircle law no longer holds in the bulk. See the work by Ben Arous and Guionnet [5] for more detail. The numerical simulation results in [7] also suggest that $\alpha = 4$ in (1.3) is the marginal case.

The edge universality has been generalized in many directions, for example, to sample covariance matrices [25J [3U HH HU E] and to correlation matrices [3] [32]. For deformed matrices, which are described as finite rank perturbations of sample covariance matrices or of Wigner matrices, the Tracy-Widom law also holds when the outliers are excluded [31 151 [TU1 |3"T1 I3"3].

In this paper, we prove the following simple criterion for this property: the necessary and sufficient condition for the joint distribution of the $k$ largest eigenvalues of a Wigner matrix (see Definition 1.1) to converge weakly to that of the Gaussian ensembles, i.e., the Tracy-Widom distribution, is that the off-diagonal entries of the Wigner matrix satisfy

$\lim_{s\to+\infty} s^4\,\mathbb{P}(|x_{12}| \ge s) = 0$. (1.4)

We note that this criterion has not been predicted in any previous works. 

The precise definition of the Wigner matrix we consider in this paper is as follows: 

Definition 1.1. The (standard) symmetric (Hermitian) Wigner matrix of size $N$ is a symmetric (Hermitian) matrix

$(H_N)_{ij} = h_{ij} = \frac{1}{\sqrt{N}}\,x_{ij}, \qquad 1 \le i, j \le N$,

where the upper-triangular entries $(x_{ij})$ ($1 \le i \le j \le N$) are independent real (complex) random variables with mean zero satisfying the following conditions:

• The upper right entries $x_{ij}$ ($1 \le i < j \le N$) are i.i.d. random variables with distribution $\nu$, satisfying $\mathbb{E}x_{12} = 0$ and $\mathbb{E}|x_{12}|^2 = 1$.

• The diagonal entries $x_{ii}$ ($1 \le i \le N$) are i.i.d. random variables with distribution $\tilde\nu$, satisfying $\mathbb{E}x_{11} = 0$ and $\mathbb{E}|x_{11}|^2 < \infty$.

• In addition, for the Hermitian case, $\mathbb{E}(x_{12})^2 = 0$.

When the random variables $x_{ij}$ and $x_{ii}$ are real Gaussian with $\mathbb{E}|x_{ii}|^2 = 2$, $H$ will be called the Gaussian Orthogonal Ensemble (GOE). Similarly, when $x_{ij}$ are complex Gaussian and $x_{ii}$ are real Gaussian with $\mathbb{E}|x_{ii}|^2 = 1$, $H$ will be called the Gaussian Unitary Ensemble (GUE). We denote by $\lambda_1 \le \lambda_2 \le \cdots \le \lambda_N$ the eigenvalues of $H_N$ and by $u_1, u_2, \ldots, u_N$ the corresponding eigenvectors of $H_N$.

The main result of this paper is the following theorem: 

Theorem 1.2. For any centered distributions $\nu$ and $\tilde\nu$ with variance 1 and finite variance, respectively, let $H_N$ be the Wigner matrix defined in Definition 1.1 such that $x_{12}$ and $x_{11}$ have distributions $\nu$ and $\tilde\nu$, respectively. Then,

• Sufficient condition: if (1.4) holds, then for any fixed $k$, the joint distribution function of the $k$ rescaled largest eigenvalues,

$\mathbb{P}\big(N^{2/3}(\lambda_N - 2) \le s_1,\ N^{2/3}(\lambda_{N-1} - 2) \le s_2,\ \ldots,\ N^{2/3}(\lambda_{N-k+1} - 2) \le s_k\big)$, (1.5)

has a limit as $N \to \infty$, which coincides with that in the GUE (GOE) case, i.e., it converges weakly to the Tracy-Widom distribution.

• Necessary condition: if (1.4) does not hold, then the joint distribution function (1.5) does not converge to the Tracy-Widom distribution. Furthermore, we have

$\limsup_{N\to\infty} \mathbb{P}\big(\lambda_N(H_N) \ge 3\big) > 0$. (1.6)

Remark 1.3. While any distribution with finite fourth moment satisfies the criterion (1.4), the converse is not true. If we consider, for example, a distribution whose density $f(x)$ decays as $|x|^{-5}(\log|x|)^{-1}$, then it does not have a finite fourth moment though (1.4) holds for it. The existence of this particular example, however, does not contradict the result in [2], which proved that $\lim_{N\to\infty}\lambda_N(H_N) = 2$ a.s. if and only if the fourth moment exists.
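The gap between criterion (1.4) and a finite fourth moment can be checked numerically. The sketch below is only an illustration under an assumed tail: it takes the hypothetical tail function P(|x| > s) = s^-4 / log s for s >= e (not a distribution taken from the paper), for which s^4 P(|x| > s) tends to zero while the truncated fourth moment grows like 4 log log t.

```python
import math

def tail(s: float) -> float:
    """Hypothetical tail P(|x| > s) = s^-4 / log(s); only meant for s >= e."""
    return s ** -4 / math.log(s)

def criterion(s: float) -> float:
    """The quantity s^4 * P(|x| > s) that appears in criterion (1.4)."""
    return s ** 4 * tail(s)

def truncated_fourth_moment(t: float) -> float:
    """E[|x|^4 ; e < |x| <= t] for the tail above, in closed form.

    Integration by parts gives 1 - 1/log(t) + 4*log(log(t)), which is
    unbounded as t grows, so the fourth moment is infinite.
    """
    return 1.0 - 1.0 / math.log(t) + 4.0 * math.log(math.log(t))

for k in (2, 4, 8, 16, 32):
    s = 10.0 ** k
    print(f"s = 1e{k}: s^4 P(|x|>s) = {criterion(s):.4f}, "
          f"truncated 4th moment = {truncated_fourth_moment(s):.3f}")
```

The first column decays like 1/log s, which is slow but sufficient for (1.4); the second column keeps increasing.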

Our result provides a very simple necessary and sufficient condition for the edge universality of Wigner matrices without assuming any other properties of the matrix entries. It also shows that the existence of the fourth moment, which was predicted to be needed for the edge universality, is not necessary for the Tracy-Widom result, as we can see from Remark 1.3.

Our proof of the main result features two key observations. 

1. If we introduce a 'cutoff' on each matrix element at $N^{-\epsilon}$, then the matrix with the cutoff approximates the original matrix well in terms of the behavior of the largest eigenvalue if and only if the criterion (1.4) holds.

2. The Green function comparison method (e.g. Theorem 6.3 in [26]), which was first introduced in [24], can be extended to random matrices whose entries have a bounded support of size $N^{-\epsilon}$ for some $\epsilon > 0$. The Green function comparison method has been applied to study the distribution of the eigenvalues of Wigner matrices, deformed Wigner matrices, covariance matrices, correlation matrices, and adjacency matrices of random graphs [24] [25] [26] [3H [4TJ [42] [15] [16]. It was also used in the study of the distribution of eigenvectors [33] and the determinant [52] of Wigner matrices. We believe that our new method in the present paper can be used to improve the results in these topics.

The first observation can be understood in the framework of deformed Wigner matrices. We consider the matrix with the cutoff as the unperturbed part and the remaining part as the perturbation. As studied in [3] [37] [9] [TOJ [34], if the perturbation is small enough, then we can predict the behavior of the largest eigenvalues of the original matrix from the matrix with the cutoff. On the other hand, if the original/perturbation matrix has an entry whose absolute value is larger than 1, then the matrix will have an eigenvalue greater than 2, hence the Tracy-Widom distribution fails. Roughly speaking, the criterion (1.4) means that each off-diagonal entry is bounded by 1 with probability $1 - o(N^{-2})$, thus the condition guarantees that no entries are larger than 1 with probability $1 - o(1)$. We remark that a similar argument was introduced in [7].
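The heuristic in the preceding paragraph amounts to a union bound over the off-diagonal entries. A short sketch of the computation, in the notation of Definition 1.1 and taking $s = N^{1/2}$ in (1.4), could read:

```latex
\mathbb{P}\big(|h_{12}| > 1\big)
  = \mathbb{P}\big(|x_{12}| > N^{1/2}\big)
  \le N^{-2}\,\Big[s^{4}\,\mathbb{P}\big(|x_{12}| \ge s\big)\Big]_{s = N^{1/2}}
  = o(N^{-2}),
\qquad
\mathbb{P}\Big(\max_{1 \le i < j \le N} |h_{ij}| > 1\Big)
  \le \frac{N(N-1)}{2}\,\mathbb{P}\big(|h_{12}| > 1\big)
  = o(1).
```

This only covers the off-diagonal entries; the diagonal entries are handled separately in the proof through the finite-variance assumption.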

The Green function comparison part is more technical. Given the matrix with the cutoff at $N^{-\epsilon}$, we first find a 'better' matrix, in the sense that it is already known to satisfy the Tracy-Widom law, whose first four moments coincide with those of the given matrix. We then apply the Lindeberg replacement strategy sufficiently many times (more precisely, $O(\epsilon^{-1})$ times) to compare the Green functions. The basic idea is as follows: Using the Green function comparison method, one can study the difference of a functional of Green functions between the 'better' matrix and the original matrix. Instead of bounding the difference directly, however, we represent it as a new, much more complicated functional of Green functions, gaining a factor $N^{-\epsilon}$. This new functional can be easily bounded for the 'better' matrix, but not for the original matrix. To solve this issue, we again use the Green function comparison method to estimate the difference of this new functional between the 'better' matrix and the original matrix. Repeating this process, we obtain the desired bound. The details will be explained later.

Though the Green function comparison method has been used in previous papers, it always required a good bound on the Green function with high probability. This is one of the reasons that the distribution of the matrix entries has been assumed to satisfy a subexponential decay condition in many papers. In this paper, however, we show a way to circumvent this problem, which can be used to achieve many other results, besides the edge universality, for heavy-tailed random matrices. See, for example, the rigidity result in Theorem 3.6, which holds for random matrices whose entries are only bounded by $N^{-\epsilon}$ for some $\epsilon > 0$. (Note: this is also an interesting result, since it shows that the locations and the fluctuations of the eigenvalues remain unchanged even if the fluctuations of the matrix entries become very large, i.e., from $N^{-1/2}$ to $N^{-\epsilon}$.)

This paper is organized as follows. In Sections 2 and 3, we introduce the notations and collect the tools we use to prove the main result. In Section 4, we prove the main result using the cutoff argument. Technical results on the Green function comparison method will be proved in Sections 5 and 6.

Remark 1.4. In this paper, for simplicity, we will prove Theorem 1.2 only for the real symmetric case with $k = 1$. The general case can be proved analogously.

2. Notations 

In the proof, we will use some variations of the standard Wigner matrix defined in Definition 1.1, which are defined as follows:

Definition 2.1 (Generalized symmetric Wigner matrix). A symmetric matrix $H_N$ is said to be a generalized symmetric Wigner matrix of size $N$ if its upper-triangular entries

$(H_N)_{ij} = h_{ij}, \qquad 1 \le i \le j \le N$,

are independent real random variables with mean zero, whose distribution may depend on $i$, $j$ and $N$, and satisfy, for some constant $C_0$,

$\big|\mathbb{E}|h_{ij}|^2 - N^{-1}\big| \le C_0N^{-1}\delta_{ij}$. (2.1)

Remark 2.2. The results on generalized Wigner matrices, especially the constants in the results, may depend on $C_0$, but we will not emphasize it in the sequel.

As in [15] [16], we will use the following definition to characterize events of very high probability.

Definition 2.3 (High probability events). Define

$\varphi := (\log N)^{\log\log N}$. (2.2)

We say that an $N$-dependent event $\Omega$ holds with $\zeta$-high probability if there exist constants $c, C > 0$, independent of $N$, such that

$\mathbb{P}(\Omega) \ge 1 - N^{C}e^{-c\varphi^{\zeta}}$ (2.3)

for all sufficiently large $N$. For simplicity, for the case $\zeta = 1$, we just say high probability.

The next condition on the distributions of the matrix entries will be used in the proof. 

Definition 2.4 (Bounded support condition). We say a family of random matrices $(H_N)_{ij} = (h_{ij})$ satisfies the bounded support condition with $q$ if, for $1 \le i, j \le N$,

$|h_{ij}| \le q^{-1}$ (2.4)

with probability larger than $1 - e^{-N^{c}}$ for some $c > 0$. Here, $q$ may depend on $N$, and usually $N^{\phi} \le q \le N^{1/2}(\log N)^{-1}$ for some $\phi > 0$.

Note that the Gaussian distribution satisfies the bounded support condition with any $q < N^{\phi}$ for any $\phi < 1/2$. We also remark that, when $H_N$ satisfies the bounded support condition, the event $\{|h_{ij}| \le q^{-1}\}$ holds with 'very' high probability, i.e., it holds with $\zeta$-high probability for any positive constant $\zeta$. For this reason, the extreme event $\{|h_{ij}| > q^{-1}\}$ is negligible, and throughout the paper, we will not consider the case in which it happens.

Definition 2.5 (Green function, semicircle, $m_{sc}$ and $m$). For a Wigner matrix $H$, we define the Green function of $H$ by

$G_{ij}(z) := \Big(\frac{1}{H - z}\Big)_{ij}, \qquad z = E + i\eta, \quad E \in \mathbb{R}, \quad \eta > 0$. (2.5)

The Stieltjes transform of the empirical eigenvalue distribution of $H$ is given by

$m(z) = m_N(z) := \frac{1}{N}\sum_{j}G_{jj}(z) = \frac{1}{N}\operatorname{Tr}\frac{1}{H - z}$. (2.6)

Define $m_{sc}(z)$ as the unique solution of

$m_{sc}(z) + \frac{1}{z + m_{sc}(z)} = 0$ (2.7)

with positive imaginary part for all $z$ with $\operatorname{Im} z > 0$, i.e.,

$m_{sc}(z) = \frac{-z + \sqrt{z^2 - 4}}{2}$, (2.8)

where the square root function is chosen with a branch cut in the segment $[-2, 2]$ so that asymptotically $\sqrt{z^2 - 4} \sim z$ at infinity. This guarantees that the imaginary part of $m_{sc}$ is non-negative for $\eta = \operatorname{Im} z > 0$, and in the $\eta \to 0$ limit it is the Wigner semicircle distribution

$\varrho_{sc}(E) := \lim_{\eta\to 0+0}\frac{1}{\pi}\operatorname{Im} m_{sc}(E + i\eta) = \frac{1}{2\pi}\sqrt{(4 - E^2)_+}$. (2.9)

We will also frequently use the notations

$z := E + i\eta, \qquad \kappa := \big||E| - 2\big|$.
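As a sanity check on the branch choice in (2.8), the following minimal sketch evaluates the Stieltjes transform of the semicircle law numerically. The function names are illustrative, and writing the square root as a product of two principal square roots is one convenient way (an implementation choice, not taken from the paper) to place the branch cut on $[-2, 2]$.

```python
import cmath
import math

def m_sc(z: complex) -> complex:
    """Stieltjes transform of the semicircle law, formula (2.8).

    sqrt(z^2 - 4) is computed as sqrt(z - 2) * sqrt(z + 2) with principal
    square roots; this puts the branch cut on [-2, 2] and behaves like z
    at infinity.
    """
    root = cmath.sqrt(z - 2) * cmath.sqrt(z + 2)
    return (-z + root) / 2

def rho_sc(E: float) -> float:
    """Semicircle density sqrt((4 - E^2)_+) / (2*pi), formula (2.9)."""
    return math.sqrt(max(4.0 - E * E, 0.0)) / (2.0 * math.pi)

z = 0.3 + 0.05j
m = m_sc(z)
print("residual of (2.7):", abs(m + 1 / (z + m)))
print("Im m_sc / pi near the real axis:",
      m_sc(0.3 + 1e-8j).imag / math.pi, "vs rho_sc:", rho_sc(0.3))
```

The residual of (2.7) should be at the level of floating-point rounding, and the imaginary part close to the real axis reproduces the density in (2.9).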

The following lemma (Lemma 4.2 of [5S]) collects elementary properties of the Stieltjes transform of the semicircle law. As a technical note, we use the notation $f \sim g$ for two positive functions in some domain $D$ if there exists a positive universal constant $C$ such that $C^{-1} \le f(z)/g(z) \le C$ holds for all $z \in D$.

Lemma 2.6. We have for all $z$ with $\operatorname{Im} z > 0$ that

$|m_{sc}(z)| = |m_{sc}(z) + z|^{-1} \le 1$. (2.10)

Let $z = E + i\eta$ with $|E| \le 5$ and $0 < \eta \le 10$. We have

$|m_{sc}(z)| \sim 1, \qquad |1 - m_{sc}^2(z)| \sim \sqrt{\kappa + \eta}$, (2.11)

and the following two bounds:

$\operatorname{Im} m_{sc}(z) \sim \begin{cases} \dfrac{\eta}{\sqrt{\kappa + \eta}} & \text{if } \kappa \ge \eta \text{ and } |E| \ge 2, \\ \sqrt{\kappa + \eta} & \text{if } \kappa \le \eta \text{ or } |E| < 2. \end{cases}$ (2.12)

Definition 2.7 (Classical location of the eigenvalue). We denote by $\gamma_j$ the classical location of the $j$-th eigenvalue, i.e., $\gamma_j$ is defined by

$N\int_{-\infty}^{\gamma_j}\varrho_{sc}(x)\,dx = j, \qquad 1 \le j \le N$. (2.13)
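The classical locations can be computed by inverting the distribution function of the semicircle law. The sketch below is a minimal illustration: it uses the closed-form antiderivative of the semicircle density and a plain bisection, and the helper names `n_sc` and `classical_location` are illustrative.

```python
import math

def n_sc(E: float) -> float:
    """Distribution function of the semicircle law supported on [-2, 2]."""
    if E <= -2.0:
        return 0.0
    if E >= 2.0:
        return 1.0
    return (0.5 + E * math.sqrt(4.0 - E * E) / (4.0 * math.pi)
            + math.asin(E / 2.0) / math.pi)

def classical_location(j: int, N: int, tol: float = 1e-12) -> float:
    """Solve N * n_sc(gamma) = j for gamma by bisection on [-2, 2]."""
    lo, hi = -2.0, 2.0
    target = j / N
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if n_sc(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

N = 1000
for j in (1, N // 2, N - 1, N):
    print(j, classical_location(j, N))
```

Near the upper edge the density behaves like $\sqrt{2 - x}/\pi$, so $2 - \gamma_{N-k}$ is approximately $(3\pi k/(2N))^{2/3}$, which is the $N^{-2/3}$ edge scale appearing throughout the paper.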



Remark 2.8. Throughout the paper, the notations $O(\cdot)$, $o(\cdot)$, and $\ll$ will always be with respect to the limit $N \to \infty$, where $a \ll b$ means $a = o(b)$. The constant $C$ will denote various constants independent of $N$.

3. Tools 

In this section, we introduce some results that will be used in the proof of the main theorem. Some 
of them are already proved in previous papers with H.-T. Yau, L. Erdos, and A. Knowles, and we 
made slight changes in the statement to fit the notations and definitions in this paper. We also extend 
some of the known results. 

Define the domain

$S(C) := \{z = E + i\eta : |E| \le 5,\ N^{-1}\varphi^{C} < \eta \le 10\}$. (3.1)

Lemma 3.1 (Previous results on generalized Wigner matrix). Let $H$ be a generalized Wigner matrix satisfying the bounded support condition with $q$. There exists a constant $C > 0$ such that, if $q \ge \varphi^{C}$, then the following properties hold with 3-high probability:

(1) Local semicircle law (Theorem 2.8 in [15]):

$\bigcap_{z\in S(C)}\Big\{|m(z) - m_{sc}(z)| \le \varphi^{C}\Big(\min\Big\{\frac{1}{q^2\sqrt{\kappa + \eta}}, \frac{1}{q}\Big\} + \frac{1}{N\eta}\Big)\Big\}$, (3.2)

$\bigcap_{z\in S(C)}\Big\{\max_{i,j}|G_{ij}(z) - \delta_{ij}m_{sc}(z)| \le \varphi^{C}\Big(\frac{1}{q} + \sqrt{\frac{\operatorname{Im} m_{sc}(z)}{N\eta}} + \frac{1}{N\eta}\Big)\Big\}$. (3.3)

(2) Bound on $\|H\|$ (Lemma 4.4 in [15]):

$\|H\| \le 2 + \varphi^{C}\big(q^{-2} + N^{-2/3}\big)$. (3.4)

(3) Delocalization (Remark 2.18 in [15]):

$\max_{\alpha, i}|u_\alpha(i)|^2 \le \frac{\varphi^{C}}{N}$. (3.5)

Furthermore, if $q \ge N^{\phi}$ for some constant $\phi > 1/3$, then the following properties hold with 3-high probability:

(4) Rigidity of the eigenvalues (Theorem 2.13 and Remarks 2.14, 2.15 in [15]):

$\bigcap_{j}\Big\{|\lambda_j - \gamma_j| \le \varphi^{C}\Big([\min(j, N - j + 1)]^{-1/3}N^{-2/3} + q^{-2}\Big)\Big\}$. (3.6)

(5) Bound outside the spectrum (Equations (3.32), (3.58) and (4.36)-(4.46) in [15]): For any large $a > 12$, there exists a constant $C > 0$ depending on $a$ such that for $z = E + i\eta$ with

2 + Lp c N~ 2/3 < E < 3, 7] = ip^+^N^K' 1 ^, (3.7) 

we have 

1 

ip a Nn' """"W - (pa^' - ^a/SJsT^ 



\rn(z) - m sc (z)| < — — , Imm(z)<— — , max|GV,-| < 377^- (3.8) 



Proof of Lemma [3~J\ For the case Co = (see (|2.1[1 ), these results except p.8[) were already proved 
with the choice of £ = 3 log log JV in [15] . Furthermore, the proofs in [15] can be extended to the case 
Co = 0(N^ 1 ) with almost no revision. Hcuristically speaking, it only brings the error of order N . 
As we can see from the proofs, these inequalities still hold after multiplying G, A, and u by a factor of 
l + OiN" 1 ). 

In order to prove (3.8), we choose $\xi = 3\log\log N$ as above and let $C_1 = 1$ in (2.15) of [15] so that $(\log N)^{C_1\xi} = \varphi^3$. In (4.36)-(4.46) of [15], it was actually proved that (see (4.45) and (4.46) of [15]), with 3-high probability,

\m(z) — m sc (z)\ < — , Imm(z) < 



maxIG^I < Oiq- 1 ) + ,. f „ /2 ,_ 2Ar _ (3.9) 



(log N)Nn ' w ~ (log N)Nr] ' 

where $\Lambda := |m - m_{sc}|$ and $\alpha \sim \sqrt{\kappa}$ in [15]. Hence, we only need to change the exponent to obtain the first two parts of (3.8). To achieve that, one can replace $\varphi^3(\log N)^2$ in (4.37) of [15] with $\varphi^{3+2a}$, and replace $\varphi^3(\log N)$ in (4.38), (4.42), and the inequality below (4.44) of [15] with $\varphi^{3+a}$. Then, as in (4.40), (4.45), and (4.46) of [15], we obtain the first two terms of (3.8).

Now we prove the third part of (|3.8|) . Using (3.32) and (3.58) of [15], we have that with 3-high 
probability, 

1 

From (|3.7|) . we have r\ > ip ^ N~ 2 / 3 . Together with assumptions on q and (|3.9[) . we obtain the third 
part of (|3.8[) and complete the proof. 

□ 

Remark 3.2. In [15], the authors proved that, for any fixed $\zeta$, there exists $C_\zeta$ such that when $C = C_\zeta$ the above statements hold with $\zeta$-high probability. In this paper, we only use $1 \le \zeta \le 3$ in the proofs. We note, however, that results similar to Lemma 3.1 hold with $\zeta$-high probability for some constant $C = C_\zeta$.

Theorem 3.3 (Edge universality on generalized Wigner matrix: Theorem 2.7 in [16]). Let $H^w$ be a GOE and $H^v$ a generalized symmetric Wigner matrix with

$\mathbb{E}^v|h_{ij}|^2 = N^{-1} + \delta_{ij}N^{-1}$.

Assume that $H^v$ satisfies the bounded support condition with $q = N^{\phi}$ for some constant $1/3 < \phi \le 1/2$. Then, there exists a constant $\delta > 0$ such that, for any $s \in \mathbb{R}$, we have

$\mathbb{P}^w\big(N^{2/3}(\lambda_N - 2) \le s - N^{-\delta}\big) - N^{-\delta} \le \mathbb{P}^v\big(N^{2/3}(\lambda_N - 2) \le s\big) \le \mathbb{P}^w\big(N^{2/3}(\lambda_N - 2) \le s + N^{-\delta}\big) + N^{-\delta}$. (3.10)

Here, $\mathbb{P}^v$ and $\mathbb{P}^w$ denote the laws of the ensembles $H^v$ and $H^w$, respectively.

Remark 3.4. As in [26] and [16], Theorem 3.3, as well as Lemma 3.5 and Theorem 3.7 below, can be extended to finite correlation functions of extreme eigenvalues. For example, we have the following extension:

$\mathbb{P}^w\big(N^{2/3}(\lambda_N - 2) \le s_1 - N^{-\epsilon},\ \ldots,\ N^{2/3}(\lambda_{N-k} - 2) \le s_{k+1} - N^{-\epsilon}\big) - N^{-\delta}$

$\le \mathbb{P}^v\big(N^{2/3}(\lambda_N - 2) \le s_1,\ \ldots,\ N^{2/3}(\lambda_{N-k} - 2) \le s_{k+1}\big)$ (3.11)

$\le \mathbb{P}^w\big(N^{2/3}(\lambda_N - 2) \le s_1 + N^{-\epsilon},\ \ldots,\ N^{2/3}(\lambda_{N-k} - 2) \le s_{k+1} + N^{-\epsilon}\big) + N^{-\delta}$

for all $k$ fixed and $N$ sufficiently large. The proof of (3.11) is similar to that of (3.10) except that it uses the general form of the Green function comparison theorem.

We slightly extend this result as follows: 

Lemma 3.5. Assume the same conditions as in Theorem 3.3, except that we assume $\mathbb{E}|h_{ii}|^2 \le C_0/N$ for some uniform $C_0$ instead. Then, the conclusion of Theorem 3.3 still holds.



We postpone the proof of this lemma to the end of this section. 

To prove our main result, we claim the following three lemmas, which extend the previous results to Wigner matrices satisfying the bounded support condition with small $q$. First, we improve the previous result on rigidity. We define the normalized empirical counting function by

$n(E) := \frac{1}{N}\#\{\lambda_j \le E\}$.

Let

$n_{sc}(E) := \int_{-\infty}^{E}\varrho_{sc}(x)\,dx$

be the distribution function of the semicircle law.

Theorem 3.6 (Rigidity of eigenvalues: small $q$ case). Let $H$ be a generalized symmetric Wigner matrix such that, for some constant $C_1$ and any $1 \le i < j \le N$,

$\mathbb{E}|h_{ii}|^2 \le C_1/N, \quad \mathbb{E}|h_{ij}|^2 = 1/N, \quad \mathbb{E}|h_{ij}|^3 \le C_1N^{-3/2}, \quad \mathbb{E}|h_{ij}|^4 \le C_1(\log N)N^{-2}$,

and $H$ satisfies the bounded support condition with $q = N^{\phi}$ for some constant $\phi > 0$. Then, there exist constants $C > 0$ and $N_0$, depending only on $C_1$ and $\phi$, such that with high probability we have

$\bigcap_{j}\Big\{|\lambda_j - \gamma_j| \le \varphi^{C}[\min(j, N - j + 1)]^{-1/3}N^{-2/3}\Big\}$ (3.12)

and

$\sup_{|E| \le 5}|n(E) - n_{sc}(E)| \le \frac{\varphi^{C}}{N}$ (3.13)

for any $N > N_0$.

The next theorem shows that the edge universality holds under the assumptions in Theorem 3.6.

Theorem 3.7 (Edge universality: small $q$ case). Let $H^w$ be a GOE and $H^v$ be a generalized symmetric Wigner matrix satisfying the conditions for $H$ in Theorem 3.6. Then, there exists a constant $\delta > 0$ such that for any $s \in \mathbb{R}$, we have

$\mathbb{P}^w\big(N^{2/3}(\lambda_N - 2) \le s - N^{-\delta}\big) - N^{-\delta} \le \mathbb{P}^v\big(N^{2/3}(\lambda_N - 2) \le s\big) \le \mathbb{P}^w\big(N^{2/3}(\lambda_N - 2) \le s + N^{-\delta}\big) + N^{-\delta}$. (3.14)

Here, $\mathbb{P}^v$ and $\mathbb{P}^w$ denote the laws of the ensembles $H^v$ and $H^w$, respectively.

Finally, we show a weak bound on $G_{ij}$ ($i \ne j$) of $H$ satisfying the conditions in Theorem 3.6.

Lemma 3.8 (Bounds on $G_{ij}$: small $q$ case). Let $H$ be a generalized symmetric Wigner matrix satisfying the conditions for $H$ in Theorem 3.6. Then, for any $0 < c < 1$, $z = E + i\eta$ with $|E| \le 5$, and $\eta \ge N^{-1+c}$, we have the following weak bound on $G_{ij}$ ($i \ne j$):

c /Imm sc (z) 1 



In the remainder of this section, we give the proof of Lemma 3.5.



(3.15) 



Proof of Lemma 3.5. This lemma is a simple extension of Theorem 2.7 of [16]. Thus, from the proof of Theorem 2.7 in [16], we find that it suffices to prove the following claim, which corresponds to Proposition 6.6 of [16]:

Claim. Let $F : \mathbb{R} \to \mathbb{R}$ be a function whose derivatives satisfy

$\sup_x |F^{(n)}(x)|(1 + |x|)^{-C_1} \le C_1, \qquad n = 1, 2, 3, 4$, (3.16)

with some constant $C_1 > 0$. Then, there exists a constant $\epsilon_0 > 0$, depending only on $C_1$, such that, for any $\epsilon < \epsilon_0$ and for any real numbers

$E, E_1, E_2 \in \{x : |x - 2| \le N^{-2/3+\epsilon}\} =: I_\epsilon$, (3.17)

and setting $\eta := N^{-2/3-\epsilon}$, we have

$\big|\mathbb{E}^vF(N\eta\operatorname{Im} m(E + i\eta)) - \mathbb{E}^wF(N\eta\operatorname{Im} m(E + i\eta))\big| \le CN^{1/3+C\epsilon}q^{-1}$ (3.18)

and

$\Big|\mathbb{E}^vF\Big(N\int_{E_1}^{E_2}dy\,\operatorname{Im} m(y + i\eta)\Big) - \mathbb{E}^wF\Big(N\int_{E_1}^{E_2}dy\,\operatorname{Im} m(y + i\eta)\Big)\Big| \le CN^{1/3+C\epsilon}q^{-1}$ (3.19)

for some $C$ and for any sufficiently large $N$.

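We only prove (3.18); (3.19) can be proved similarly. In order to prove the claim, we only need to prove

$\Big|\mathbb{E}^vF\Big(\eta^2\sum_{i\ne j}G_{ij}\overline{G_{ij}}\Big) - \mathbb{E}^wF\Big(\eta^2\sum_{i\ne j}G_{ij}\overline{G_{ij}}\Big)\Big| \le CN^{1/3+C\epsilon}q^{-1}$. (3.20)

(See the proof of Theorem 6.3 of [25] for more detail.)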

Fix a bijective ordering map on the index set of the independent matrix elements,

$\Phi : \{(k, \ell) : 1 \le k \le \ell \le N\} \to \{1, \ldots, \gamma_{\max}\}, \qquad \gamma_{\max} := \frac{N(N+1)}{2}$, (3.21)

and let $H_\gamma$ be the Wigner matrix whose matrix elements $h_{k\ell}$ follow the distribution of $h^v_{k\ell}$ if $\Phi(k, \ell) \le \gamma$ and that of $h^w_{k\ell}$ otherwise. In particular, $H_0 = H^w$ and $H_{\gamma_{\max}} = H^v$. (Note that the index used here is slightly different from previous papers.) Since the Gaussian distribution satisfies the bounded support condition with $q$, we remark that $H_\gamma$ satisfies the bounded support condition with $q$ for any $\gamma = 0, 1, \ldots, \gamma_{\max}$.
For simplicity, let 

1 



en 



Hry — Z 



(3.22) 



Note that the matrices $H_\gamma$ and $H_{\gamma-1}$ differ only at the $(a, b)$ and $(b, a)$ elements, where $\Phi(a, b) = \gamma$. Let $v_{ab} := h^v_{ab}$ and $w_{ab} := h^w_{ab}$. We define matrices $V$ and $W$ by

$V_{k\ell} := \delta_{ak}\delta_{b\ell}v_{ab} + \delta_{a\ell}\delta_{bk}v_{ba}, \qquad W_{k\ell} := \delta_{ak}\delta_{b\ell}w_{ab} + \delta_{a\ell}\delta_{bk}w_{ba}$, (3.23)

so that we can rewrite $H_\gamma$ and $H_{\gamma-1}$ as

$H_\gamma = Q + V, \qquad H_{\gamma-1} = Q + W$, (3.24)

with a matrix $Q$ satisfying $Q_{ab} = Q_{ba} = 0$.

Define the Green functions

$R := \frac{1}{Q - z}, \qquad S := \frac{1}{H_\gamma - z}, \qquad T := \frac{1}{H_{\gamma-1} - z}$. (3.25)

Note that we have the a priori estimates

$\max_{E\in I_\epsilon}\max_{k,\ell}|S_{k\ell}(E + i\eta) - \delta_{k\ell}m_{sc}(E + i\eta)| \le CN^{-1/3+2\epsilon}$ (3.26)

from part (1) of Lemma 3.1 and

$\max_{E\in I_\epsilon}\max_{k,\ell}|R_{k\ell}(E + i\eta) - \delta_{k\ell}m_{sc}(E + i\eta)| \le CN^{-1/3+2\epsilon}$ (3.27)

with high probability. To see (3.27), we first expand $R_{k\ell}$ using the resolvent expansion

$R = S + SVS + (SV)^2S + (SV)^3S + (SV)^4S + (SV)^5R$. (3.28)

Since $V$ has at most two non-zero entries, each term in the expansion can be written as a sum of finitely many terms consisting of the entries of $S$, $V$, and $R$. From the bound (3.26), the fact that $v_{ab}$ satisfies the bounded support condition with $q$, and the trivial bound $|R_{ij}| \le \eta^{-1} \le N$, we obtain the estimate (3.27).
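For completeness, the resolvent expansion used here follows from a single resolvent identity. A short derivation, using only $R = (Q - z)^{-1}$, $S = (H_\gamma - z)^{-1}$ and $H_\gamma = Q + V$, could be written as:

```latex
S^{-1} - R^{-1} = (H_\gamma - z) - (Q - z) = V
\quad\Longrightarrow\quad
R - S = S\,\big(S^{-1} - R^{-1}\big)\,R = S V R ,
\\[4pt]
R = S + SVR = S + SVS + (SV)^2 R = \cdots
  = \sum_{k=0}^{4} (SV)^k S + (SV)^5 R .
```

Stopping after five iterations is what allows the last term to be controlled with only the trivial bound on the entries of $R$: each factor of $SV$ contributes a small entry of $V$, bounded by $q^{-1}$.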

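When $a \ne b$, from the proof of Proposition 6.6 in [16] we have that

$\Big|\mathbb{E}F\Big(\eta^2\sum_{i\ne j}S_{ij}\overline{S_{ij}}\Big) - \mathbb{E}F\Big(\eta^2\sum_{i\ne j}T_{ij}\overline{T_{ij}}\Big)\Big| \le CN^{-5/3+C\epsilon}q^{-1}$. (3.29)

Consider the case $a = b$. Using the resolvent expansion

$S = R - RVR + (RV)^2S$, (3.30)

we find that

$S_{ij}\overline{S_{ij}} = R_{ij}\overline{R_{ij}} - R_{ia}V_{aa}R_{aj}\overline{R_{ij}} - R_{ij}\overline{R_{ia}V_{aa}R_{aj}} + R_{ia}V_{aa}R_{aj}\overline{R_{ia}V_{aa}R_{aj}}$

$\quad + [(RV)^2S]_{ij}\overline{R_{ij}} - [(RV)^2S]_{ij}\overline{R_{ia}V_{aa}R_{aj}} + R_{ij}\overline{[(RV)^2S]_{ij}} - R_{ia}V_{aa}R_{aj}\overline{[(RV)^2S]_{ij}} + \big|[(RV)^2S]_{ij}\big|^2$. (3.31)

Note that $|V_{aa}| \le q^{-1}$ with high probability. When $i, j \ne a$, we have from the estimates (3.26) and (3.27) that

$\big|S_{ij}\overline{S_{ij}} - R_{ij}\overline{R_{ij}}\big| \le CN^{-1+6\epsilon}q^{-1}$ (3.32)

with high probability. Let

$y^S := \eta^2\sum_{i\ne j}S_{ij}\overline{S_{ij}}, \qquad y^R := \eta^2\sum_{i\ne j}R_{ij}\overline{R_{ij}}$. (3.33)

When $i = a$ or $j = a$, we have one less off-diagonal entry of $R$ or $S$ in the expansion (3.31), but there are only $O(N)$ such terms. Thus, we obtain with high probability that

$|y^S - y^R| \le CN^{-1/3+4\epsilon}q^{-1}$. (3.34)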

Consider the Taylor expansion

$F(y^S) - F(y^R) = F'(y^R)(y^S - y^R) + \frac{1}{2}F''(\zeta)(y^S - y^R)^2$ (3.35)

for some $\zeta$, which lies between $y^S$ and $y^R$. Since $y^S \le N^{2\epsilon}$ and $y^R \le N^{2\epsilon}$ with high probability, as we can see from the bounds (3.26) and (3.27), $|F''(\zeta)| \le N^{C\epsilon}$ with high probability from the assumption. Thus, we obtain

$\big|\mathbb{E}(F(y^S) - F(y^R))\big| \le \big|\mathbb{E}\big(F'(y^R)\,\mathbb{E}_{aa}(y^S - y^R)\big)\big| + CN^{-2/3+C'\epsilon}q^{-2}$. (3.36)

For the first term on the right-hand side, we use (3.31) and the facts that $R$ is independent of $V$ and $\mathbb{E}V_{aa} = 0$, and bound this term by $O(N^{-1/3+4\epsilon}q^{-2})$. Therefore,

$\big|\mathbb{E}(F(y^S) - F(y^R))\big| \le O(N^{-1/3+4\epsilon}q^{-2})$. (3.37)

Note that we can get the same estimate if we put $T$ in place of $S$. Hence, we find that

$\Big|\mathbb{E}F\Big(\eta^2\sum_{i\ne j}S_{ij}\overline{S_{ij}}\Big) - \mathbb{E}F\Big(\eta^2\sum_{i\ne j}T_{ij}\overline{T_{ij}}\Big)\Big| \le CN^{-1/3+4\epsilon}q^{-2}$. (3.38)

We write the quantity on the left-hand side of (3.20) as a telescopic sum,

$\mathbb{E}^vF\Big(\eta^2\sum_{i\ne j}G_{ij}\overline{G_{ij}}\Big) - \mathbb{E}^wF\Big(\eta^2\sum_{i\ne j}G_{ij}\overline{G_{ij}}\Big) = \sum_{\gamma=1}^{\gamma_{\max}}\Big[\mathbb{E}F\Big(\eta^2\sum_{i\ne j}S_{ij}\overline{S_{ij}}\Big) - \mathbb{E}F\Big(\eta^2\sum_{i\ne j}T_{ij}\overline{T_{ij}}\Big)\Big]$. (3.39)

Since the number of summands on the right-hand side of (3.39) with $a \ne b$ is $O(N^2)$ and the number of summands with $a = b$ is $O(N)$, we find that (3.20) holds from (3.29) and (3.38). This proves the claim, which implies the desired lemma. □



4. Proof of the main result 



In this section, we prove the main result, Theorem 1.2. Let $H_N$ be a Wigner matrix defined as in Definition 1.1 such that $x_{12}$ and $x_{11}$ have distributions $\nu$ and $\tilde\nu$, respectively. We begin by proving the second part of the main result.

Proof of the main result: Necessary condition. Assume that $\lim_{s\to\infty}s^4\,\mathbb{P}(|x_{12}| \ge s) \ne 0$. We note that there exist a constant $0 < c_1 < 1/2$ and a sequence $r_1, r_2, \ldots$ such that $r_k \to \infty$ as $k \to \infty$ and

$\mathbb{P}(|x_{12}| \ge r_k) \ge c_1r_k^{-4}$. (4.1)

Consider the event

$\Gamma_N := \{\text{There exist } i \text{ and } j,\ 1 \le i < j \le N, \text{ such that } |h_{ij}| \ge 4,\ |h_{ii}| < 1, \text{ and } |h_{jj}| < 1\}$. (4.2)

We first show that, when $\Gamma_N$ holds, $\lambda_N(H_N) \ge 3$. Define $u \in \mathbb{R}^N$ through

$u(m) := \begin{cases} 1/\sqrt{2} & \text{if } m = i, \\ (1/\sqrt{2})\cdot\operatorname{sgn}(h_{ij}) & \text{if } m = j, \\ 0 & \text{otherwise}. \end{cases}$ (4.3)

Here, $\operatorname{sgn}(h_{ij}) := |h_{ij}|/h_{ij}$. Since $\|u\|_2 = 1$, it can be easily seen that

$\lambda_N(H_N) \ge \langle u, H_Nu\rangle = h_{ii}|u_i|^2 + h_{jj}|u_j|^2 + h_{ij}u_iu_j + h_{ji}u_ju_i = \frac{h_{ii}}{2} + \frac{h_{jj}}{2} + |h_{ij}| \ge 3$. (4.4)
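The test-vector bound (4.4) is easy to check on a small example. The sketch below builds a hypothetical real symmetric matrix with small entries everywhere except one large off-diagonal entry (the size, the seed, and the entry values are arbitrary choices for illustration) and evaluates the quadratic form for the vector $u$ of (4.3).

```python
import math
import random

def rayleigh_lower_bound(H, i, j):
    """Return <u, H u> for the unit vector u of (4.3) supported on i and j."""
    n = len(H)
    sign = 1.0 if H[i][j] >= 0 else -1.0
    u = [0.0] * n
    u[i] = 1.0 / math.sqrt(2.0)
    u[j] = sign / math.sqrt(2.0)
    Hu = [sum(H[r][c] * u[c] for c in range(n)) for r in range(n)]
    return sum(u[r] * Hu[r] for r in range(n))

random.seed(0)
n = 6
H = [[0.0] * n for _ in range(n)]
for r in range(n):
    for c in range(r, n):
        H[r][c] = H[c][r] = random.uniform(-0.9, 0.9)
H[1][4] = H[4][1] = -4.2  # one large off-diagonal entry with |h_ij| >= 4

value = rayleigh_lower_bound(H, 1, 4)
print(value, H[1][1] / 2 + H[4][4] / 2 + abs(H[1][4]))
```

Only the three entries $h_{ii}$, $h_{jj}$, $h_{ij}$ enter the quadratic form, so the remaining entries of the matrix play no role in the lower bound on the largest eigenvalue.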

We now prove that there exists a constant cq > 0, independent of N, such that P(IV) > co for any 
N £ {|> fe /4) 2 J : k £ N} with N > 2E|xn| 2 . Note that it implies Define an event 

T^v := { There exist i and j, 1 < i < j < N, such that > 4}. (4.5) 
Clearly, if N = |_( r fc/4) 2 J , then we have that 

1 - P(r' w ) < (1 - P(M > 4 ) ) JV(Af - 1)/2 < (1 - F(|x 12 | > r k )) N ^ /2 

< (1 - ^Y^ 2 < (1 " c 2 N~ 2 f^' 2 ^ 
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for some constant < C2 < 1, independent of N. Since (1 — C2N~ 2 ) NI - N ~ 1 ^ 2 < C3 for some constant 
< C3 < 1, independent of N, we find that P(r' Ar ) > 1 — C3. Suppose that Tn holds with \hi>j>\ > 4 
for some indices i' and j' (1 < i' < j' < N). From Markov inequality, we have 

F(\h iH >\ > 1) = P(|x w | > VN) < N^Elx^l 2 < 1 (4.7) 

and P(|ft.j'j'| > 1) < 1/2 as well. Since the diagonal elements /i^v and hyy are independent to each 
other, and the event Pflhjvl < 1; l^j'j'l < 1) is indepedent from T' N , we find that 

p(r Ar )>p(r' Ar )(i/2) 2 >i(i- C3 ). (4.8) 

This completes the proof. □ 
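The uniform bound on $(1 - c_2 N^{-2})^{N(N-1)/2}$ used above can be illustrated numerically. The sketch below assumes a hypothetical value of $c_2$ and uses the elementary inequality $(1-x)^m \leq e^{-xm}$ to exhibit one admissible $c_3$; it is not taken from the paper.

```python
import math

def no_large_entry_bound(c2, n):
    """Evaluate (1 - c2 / n^2)^(n (n - 1) / 2), the last bound in (4.6)."""
    return (1.0 - c2 / n**2) ** (n * (n - 1) / 2)

def uniform_c3(c2):
    """One admissible c3 for all n >= 2.

    Uses (1 - x)^m <= exp(-x m) with x m = c2 (n - 1) / (2 n) >= c2 / 4.
    """
    return math.exp(-c2 / 4.0)

C2_EXAMPLE = 0.5  # hypothetical value of the constant c2
BOUNDS = {n: no_large_entry_bound(C2_EXAMPLE, n) for n in (2, 10, 100, 10_000)}
```

For large $N$ the quantity approaches $e^{-c_2/2}$, which stays strictly below 1, as the argument requires.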

Next, we prove that, if (1.4) holds, then

$\mathbb{P}^{GOE}\big(N^{2/3}(\lambda_N - 2) \leq s - N^{-\delta}\big) - N^{-\delta} \leq \mathbb{P}^{v}\big(N^{2/3}(\lambda_N - 2) \leq s\big) \leq \mathbb{P}^{GOE}\big(N^{2/3}(\lambda_N - 2) \leq s + N^{-\delta}\big) + N^{-\delta}$, (4.9)

where, under $\mathbb{P}^{GOE}$, $\lambda_N$ denotes the largest eigenvalue of GOE.

If $H_N$ satisfies the assumptions of Theorem 3.6, then (4.9) indeed holds, as we have seen from Theorem
3.7. Thus, we construct from $H_N$ a random matrix $H^S$ satisfying the bounded support condition and
the moment condition in Theorem 3.6. Comparing the largest eigenvalue of $H^S$ with that of $H_N$, we
will show that the difference between them is negligible with probability $1 - o(1)$.

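Proof of the main result: Sufficient condition. For fixed $\epsilon > 0$ and any $N$, define

$\alpha := \alpha_N := \mathbb{P}\big(|x_{12}| \geq N^{1/2-\epsilon}\big)$, $\tilde\alpha := \tilde\alpha_N := \mathbb{P}\big(|x_{11}| \geq N^{1/2-\epsilon}\big)$, (4.10)

$\beta := \beta_N := \mathbb{E}\big[\mathbf{1}(|x_{12}| \geq N^{1/2-\epsilon})\, x_{12}\big]$, $\tilde\beta := \tilde\beta_N := \mathbb{E}\big[\mathbf{1}(|x_{11}| \geq N^{1/2-\epsilon})\, x_{11}\big]$. (4.11)

By (1.4) and integration by parts, for any $\delta > 0$ and large enough $N$,

$\alpha \leq \delta N^{-2+4\epsilon}$, $\tilde\alpha \leq \delta N^{-1+2\epsilon}$, (4.12)

$|\beta| \leq \delta N^{-3/2+3\epsilon}$, $|\tilde\beta| \leq \delta N^{-1/2+\epsilon}$. (4.13)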
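The tail bounds behind (4.12) and (4.13) can be made explicit. The following is a sketch for the off-diagonal quantities only, writing $s := N^{1/2-\epsilon}$ (a shorthand introduced here, not notation from the text) and assuming $N$ is large enough that $t^4\,\mathbb{P}(|x_{12}| \geq t) \leq \delta$ for all $t \geq s$, which (1.4) permits:

```latex
\begin{align*}
\alpha &= \mathbb{P}(|x_{12}| \geq s) \leq \delta s^{-4} = \delta N^{-2+4\epsilon}, \\
|\beta| &\leq \mathbb{E}\big[|x_{12}|\,\mathbf{1}(|x_{12}| \geq s)\big]
        = s\,\mathbb{P}(|x_{12}| \geq s) + \int_s^\infty \mathbb{P}(|x_{12}| \geq t)\,dt \\
       &\leq \delta s^{-3} + \delta \int_s^\infty t^{-4}\,dt
        = \tfrac{4}{3}\,\delta s^{-3} = \tfrac{4}{3}\,\delta N^{-3/2+3\epsilon}.
\end{align*}
```

Since $\delta$ is arbitrary, the factor $4/3$ can be absorbed into $\delta$. The diagonal bounds follow in the same way from $\mathbb{E}|x_{11}|^2 < \infty$ instead of (1.4).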
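Let $\nu_S$, $\nu_L$, $\tilde\nu_S$, $\tilde\nu_L$ have the distribution densities:

$\rho_{\nu_S}(x + \beta) = \mathbf{1}_{|x| < N^{1/2-\epsilon}} \dfrac{\rho_\nu(x)}{1 - \alpha_N}$, $\rho_{\nu_L}(x + \beta) = \mathbf{1}_{|x| \geq N^{1/2-\epsilon}} \dfrac{\rho_\nu(x)}{\alpha_N}$, (4.14)

$\rho_{\tilde\nu_S}(x + \tilde\beta) = \mathbf{1}_{|x| < N^{1/2-\epsilon}} \dfrac{\rho_{\tilde\nu}(x)}{1 - \tilde\alpha_N}$, $\rho_{\tilde\nu_L}(x + \tilde\beta) = \mathbf{1}_{|x| \geq N^{1/2-\epsilon}} \dfrac{\rho_{\tilde\nu}(x)}{\tilde\alpha_N}$. (4.15)

Here, the subindices $S$ and $L$ are for small and large, respectively. Let $\nu_C$ and $\tilde\nu_C$ be the distributions such
that $x = 1$ with probability $\alpha$ and $\tilde\alpha$, respectively, and $x = 0$ otherwise.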

Let $H^S$, $H^L$ and $H^C$ be the random matrices such that $H^S_{ij} = N^{-1/2} x^S_{ij}$, $H^L_{ij} = N^{-1/2} x^L_{ij}$, and
$H^C_{ij} = c_{ij}$, where $x^S_{ij}$, $x^L_{ij}$, and $c_{ij}$ are independent random variables such that:

(1) the entries $x^S_{ij}$ have distribution $\nu_S$ if $i \neq j$ and $x^S_{ii}$ have distribution $\tilde\nu_S$,

(2) the entries $x^L_{ij}$ have distribution $\nu_L$ if $i \neq j$ and $x^L_{ii}$ have distribution $\tilde\nu_L$,

(3) the entries $c_{ij}$ have distribution $\nu_C$ if $i \neq j$ and $c_{ii}$ have distribution $\tilde\nu_C$.

Clearly for independent $H^S$, $H^L$ and $H^C$, we know

$H_{ij} \sim H^S_{ij}(1 - H^C_{ij}) + H^L_{ij}H^C_{ij} + N^{-1/2}\beta\,\mathbf{1}(i \neq j) + N^{-1/2}\tilde\beta\,\mathbf{1}(i = j)$, (4.16)

where the notation "$\sim$" denotes that both sides have the same distribution. It is easy to see that
the matrix $M$ defined by

$M_{ij} := N^{-1/2}\beta\,\mathbf{1}(i \neq j) + N^{-1/2}\tilde\beta\,\mathbf{1}(i = j)$
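satisfies $\|M\| \leq N^{-1+4\epsilon}$, which is negligible for small $\epsilon$, i.e.,

$\big\| \big(H^S_{ij}(1 - H^C_{ij}) + H^L_{ij}H^C_{ij}\big)_{i,j} - H \big\| \leq N^{-1+4\epsilon}$. (4.17)

By (1.4) and integration by parts, we have for $i \neq j$,

$\mathbb{E}x^S_{ij} = 0$, $\mathbb{E}|x^S_{ij}|^2 = \dfrac{1}{1-\alpha} \displaystyle\int_{|x| < N^{1/2-\epsilon}} (x + \beta)^2 \rho_\nu(x)\,dx = 1 - O(N^{-1+2\epsilon})$, (4.18)

$\mathbb{E}|x^S_{ij}|^3 = O(1)$, $\mathbb{E}|x^S_{ij}|^4 = O(\log N)$,

and

$\mathbb{E}x^S_{ii} = 0$, $\mathbb{E}|x^S_{ii}|^2 = \dfrac{1}{1-\tilde\alpha} \displaystyle\int_{|x| < N^{1/2-\epsilon}} (x + \tilde\beta)^2 \rho_{\tilde\nu}(x)\,dx = 1 - o(1)$. (4.19)

We note that the matrix $H^S$, after rescaling so that its off-diagonal entries have unit variance,
is a Wigner matrix that satisfies the assumptions on $H^v$ in Theorem 3.7 ($H$ in Theorem 3.6). Together
with the fact $\mathbb{E}|x^S_{12}|^2 = 1 - O(N^{-1+2\epsilon})$, we find that there exists a $\delta > 0$ such that, for any $s \in \mathbb{R}$, we
have

$\mathbb{P}^S\big(N^{2/3}(\lambda_N - 2) \leq s - N^{-\delta}\big) - N^{-\delta} \leq \mathbb{P}^{GOE}\big(N^{2/3}(\lambda_N - 2) \leq s\big) \leq \mathbb{P}^S\big(N^{2/3}(\lambda_N - 2) \leq s + N^{-\delta}\big) + N^{-\delta}$, (4.20)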

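where $\mathbb{P}^S$ is the law for $H^S$. We write the first two terms on the right-hand side of (4.16) as follows:

$H^S_{ij}(1 - H^C_{ij}) + H^L_{ij}H^C_{ij} = H^S_{ij} + (H^L_{ij} - H^S_{ij})H^C_{ij} =: H^S_{ij} + \tilde H_{ij}H^C_{ij}$.

We can see that $H^C_{ij}$ is independent of $H^S_{ij}$ and $\tilde H_{ij}$. Though $\tilde H_{ij}$ depends on $H^S_{ij}$, from the condition
(1.4) we know that for any $i, j$,

$\mathbb{P}\big(|\tilde H_{ij}H^C_{ij}| \geq 3/4\big) \leq \mathbb{P}\big(|x_{ij}| \geq (1/2)N^{1/2}\big) \leq o(N^{-2}) + o(N^{-1})\delta_{ij}$. (4.21)

Here, for the last inequality, we used (1.4) and that $\mathbb{E}|x_{ii}|^2 < \infty$. For this reason, instead of
$H^S_{ij} + \tilde H_{ij}H^C_{ij}$, we only need to study the matrix whose entries have a cutoff on $\tilde H_{ij}H^C_{ij}$ as follows:

$H^S_{ij} + D_{ij}H^C_{ij}$, $D_{ij} := \mathbf{1}\big(|\tilde H_{ij}H^C_{ij}| \leq 3/4\big)\,\tilde H_{ij}$. (4.22)

We note that

$\mathbb{P}\big(\max_{i,j} |D_{ij}H^C_{ij} - \tilde H_{ij}H^C_{ij}| = 0\big) = 1 - o(1)$. (4.23)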

Furthermore, it is easy to see that we can introduce a cutoff on the matrix $H^C$ such that:

• The matrix with the cutoff, $\tilde H^C$, coincides with $H^C$ with probability higher than $1 - o(1)$, i.e.,

$\mathbb{P}(\tilde H^C = H^C) \geq 1 - o(1)$. (4.24)

• The number of non-zero entries is bounded by

$\#\{(i,j) : \tilde H^C_{ij} \neq 0\} \leq N^{5\epsilon}$. (4.25)

• If $\tilde H^C_{ij} \neq 0$ and $\tilde H^C_{kl} \neq 0$, then either $\{i,j\} = \{k,l\}$ or $\{i,j\} \cap \{k,l\} = \emptyset$.

With (4.24), we only need to study the largest eigenvalue of

$H^S + E$ where $E_{ij} := D_{ij}\tilde H^C_{ij}$.

We note that $\max_{i,j} |E_{ij}| \leq 3/4$, and the rank of $E$ is less than $N^{5\epsilon}$.

Let $\lambda_N$ and $\mu_N$ be the largest eigenvalues of $H^S$ and $H^S + E$, respectively. We claim that

$\mathbb{P}\big(|\lambda_N - \mu_N| \leq N^{-3/4}\big) = 1 - o(1)$. (4.26)

From the claim (4.26), together with (4.20) and the preceding estimates, we obtain the desired result (1.5).
Now we prove (4.26). Recall that $H^C$ and $H^S$ are independent, i.e., the positions of the nonzero elements
of $E$ are independent of $H^S$. Then by symmetry, we can assume that for some $s, t \leq N^{5\epsilon}$, among the
matrix entries of $E$, only

$E_{12}, E_{21}, E_{34}, E_{43}, \cdots, E_{2s-1,2s}, E_{2s,2s-1}, E_{2s+1,2s+1}, E_{2s+2,2s+2}, \cdots, E_{2s+t,2s+t}$

are non-zero. Then, we can decompose $E$ as

$E = VDV^*$ (4.27)

where $D$ is a $(2s + t) \times (2s + t)$ diagonal matrix and $V$ is $N \times (2s + t)$. Furthermore,

$D = \mathrm{diag}\{E_{12}, -E_{12}, E_{34}, -E_{34}, \cdots, E_{2s-1,2s}, -E_{2s-1,2s}, E_{2s+1,2s+1}, E_{2s+2,2s+2}, \cdots, E_{2s+t,2s+t}\}$. (4.28)

If $j = 2n - 1$, $n \leq s$, then

$V_{ij} = \delta_{i,2n-1}\dfrac{1}{\sqrt{2}} + \delta_{i,2n}\dfrac{1}{\sqrt{2}}$. (4.29)

If $j = 2n$, $n \leq s$, then

$V_{ij} = \delta_{i,2n-1}\dfrac{1}{\sqrt{2}} - \delta_{i,2n}\dfrac{1}{\sqrt{2}}$. (4.30)

If $j = 2s + n$, $1 \leq n \leq t$, then

$V_{ij} = \delta_{ij}$. (4.31)
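The decomposition (4.27)-(4.31) can be verified on a small example. The following sketch builds $V$ and the diagonal of $D$ for real entries and recomputes $VDV^*$; the function names and the test values are hypothetical.

```python
import math

def build_v_and_d(n, pair_entries, diag_entries):
    """Build V (n x (2s+t)) and the diagonal of D as in (4.28)-(4.31).

    pair_entries: values E_{12}, E_{34}, ..., E_{2s-1,2s}
    diag_entries: values E_{2s+1,2s+1}, ..., E_{2s+t,2s+t}
    """
    s, t = len(pair_entries), len(diag_entries)
    c = 1.0 / math.sqrt(2.0)
    v = [[0.0] * (2 * s + t) for _ in range(n)]
    d = []
    for k, e in enumerate(pair_entries):
        # 0-based indices 2k, 2k+1 correspond to 1-based 2n-1, 2n with n = k+1
        v[2 * k][2 * k], v[2 * k + 1][2 * k] = c, c            # (4.29)
        v[2 * k][2 * k + 1], v[2 * k + 1][2 * k + 1] = c, -c   # (4.30)
        d.extend([e, -e])                                       # (4.28)
    for k, e in enumerate(diag_entries):
        v[2 * s + k][2 * s + k] = 1.0                           # (4.31)
        d.append(e)
    return v, d

def v_d_vt(v, d):
    """Compute V D V^T for real V and diagonal D given as a list."""
    n = len(v)
    return [[sum(v[i][k] * d[k] * v[j][k] for k in range(len(d)))
             for j in range(n)] for i in range(n)]

def perturbation(n, pair_entries, diag_entries):
    """The matrix E itself, with the non-zero pattern assumed in the text."""
    e = [[0.0] * n for _ in range(n)]
    for k, val in enumerate(pair_entries):
        e[2 * k][2 * k + 1] = e[2 * k + 1][2 * k] = val
    s = len(pair_entries)
    for k, val in enumerate(diag_entries):
        e[2 * s + k][2 * s + k] = val
    return e
```

Each off-diagonal pair contributes the eigenvalues $\pm E_{2n-1,2n}$ with eigenvectors $(1, \pm 1)/\sqrt{2}$, which is why the columns of $V$ come in such pairs.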

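Note that $E$ is a symmetric matrix. Using Lemma 6.1 in [34], we find that, if $\mu$ is an eigenvalue of
$H^S + E$, then

$\det\big(V^* G^S(\mu) V + D^{-1}\big) = 0$ where $G^S(\mu) = \dfrac{1}{H^S - \mu}$. (4.32)

Similarly, if we let $\mu^\gamma$ be an eigenvalue of $H^S + \gamma E$, then

$\det\big(V^* G^S(\mu^\gamma) V + (\gamma D)^{-1}\big) = 0$ where $G^S(\mu^\gamma) = \dfrac{1}{H^S - \mu^\gamma}$. (4.33)

Define the $(2s + t) \times (2s + t)$ matrix $F^\gamma$ by

$F^\gamma := F^\gamma(\mu) := V^* G^S(\mu) V + (\gamma D)^{-1}$. (4.34)

Then, we have for the following $2 \times 2$ blocks of $F^\gamma$,

$\begin{pmatrix} F^\gamma_{2\alpha-1,2\beta-1} & F^\gamma_{2\alpha-1,2\beta} \\ F^\gamma_{2\alpha,2\beta-1} & F^\gamma_{2\alpha,2\beta} \end{pmatrix} = \dfrac{1}{2} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} \begin{pmatrix} G^S_{2\alpha-1,2\beta-1} & G^S_{2\alpha-1,2\beta} \\ G^S_{2\alpha,2\beta-1} & G^S_{2\alpha,2\beta} \end{pmatrix} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} + \delta_{\alpha\beta} \begin{pmatrix} (\gamma E_{2\alpha-1,2\alpha})^{-1} & 0 \\ 0 & -(\gamma E_{2\alpha-1,2\alpha})^{-1} \end{pmatrix}$ (4.35)

where $1 \leq \alpha, \beta \leq s$. For the following $2 \times 1$ blocks of $F^\gamma$, we get

$\begin{pmatrix} F^\gamma_{2\alpha-1,\beta} \\ F^\gamma_{2\alpha,\beta} \end{pmatrix} = \dfrac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} \begin{pmatrix} G^S_{2\alpha-1,\beta} \\ G^S_{2\alpha,\beta} \end{pmatrix}$ (4.36)

where $1 \leq \alpha \leq s$, $2s + 1 \leq \beta \leq 2s + t$. Finally, for the following $1 \times 1$ blocks of $F^\gamma$,

$F^\gamma_{\alpha,\alpha} = G^S_{\alpha,\alpha} + (\gamma E_{\alpha,\alpha})^{-1}$ (4.37)

where $2s + 1 \leq \alpha \leq 2s + t$.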

Let $z = 2 + iN^{-2/3}$. From (3.3) and (2.12), with high probability, we have

$\max_{1 \leq i \leq 2s+t} |G^S_{ii}(z) + 1| \leq N^{-\epsilon/2}$. (4.38)

For off-diagonal terms, with (3.15) and (2.12), we have that

$\max_{i \neq j} |G^S_{ij}(z)| \leq N^{-1/6}$ (4.39)

holds with probability $1 - o(1)$. Define

$\tilde\mu := \lambda_N \pm N^{-3/4}$. (4.40)

From Remark 3.4 and the fact that the largest eigenvalues of GOE are separated on the scale $N^{-2/3}$,
we have

$\mathbb{P}\big(\lambda_N - \lambda_{N-1} \geq 2N^{-3/4}\big) \geq 1 - o(1)$, (4.41)

thus

$\mathbb{P}\big(\min_\alpha |\lambda_\alpha - \tilde\mu| \geq N^{-3/4}\big) \geq 1 - o(1)$. (4.42)

With the distribution of $\lambda_N$ (see (3.14)), we also have with probability $1 - o(1)$ that

$|\tilde\mu - 2| \leq N^{-2/3}\varphi^C$. (4.43)

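Consider the identity

$|G^S_{ij}(z) - G^S_{ij}(\tilde\mu)| = \Big| \sum_\alpha u_\alpha(i)u_\alpha(j) \Big( \dfrac{1}{\lambda_\alpha - z} - \dfrac{1}{\lambda_\alpha - \tilde\mu} \Big) \Big| \leq |z - \tilde\mu| \sum_\alpha \dfrac{|u_\alpha(i)u_\alpha(j)|}{|\lambda_\alpha - z|\,|\lambda_\alpha - \tilde\mu|}$. (4.44)

From (4.42), (4.43), the delocalization in Lemma 3.1 and the rigidity result in Theorem 3.6, we have
with probability $1 - o(1)$ that

$\max_{i,j} |G^S_{ij}(z) - G^S_{ij}(\tilde\mu)| \leq CN^{-1/4}$. (4.45)

Combining (4.38), (4.39), and (4.45), we find that (4.38) and (4.39) still hold with $z$ replaced by $\tilde\mu$ and
the right-hand side doubled. From $\max_{i,j} |E_{ij}| \leq 3/4$, together with (4.35)-(4.37), we have
for any $0 < \gamma \leq 1$ that

$\min_i |F^\gamma_{ii}(\tilde\mu)| \geq \dfrac{1}{3} - O(N^{-\epsilon/2})$, $\max_{i \neq j} |F^\gamma_{ij}(\tilde\mu)| \leq CN^{-1/6} + \mathbf{1}_{|i-j|=1}\,CN^{-\epsilon/2}$ (4.46)

holds with probability $1 - o(1)$. This implies that, since $\epsilon$ is small enough,

$\det F^\gamma(\tilde\mu) \neq 0$.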
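One standard way to see that bounds of the type (4.46) force a non-vanishing determinant is strict diagonal dominance: if each diagonal entry exceeds in modulus the sum of the moduli of the other entries in its row, the matrix is invertible. The sketch below illustrates this on a hypothetical toy matrix; whether the paper argues exactly this way is an assumption, and the numbers are invented for illustration.

```python
def is_strictly_diagonally_dominant(m):
    """True if |m[i][i]| > sum over j != i of |m[i][j]| for every row i."""
    n = len(m)
    return all(
        abs(m[i][i]) > sum(abs(m[i][j]) for j in range(n) if j != i)
        for i in range(n)
    )

def determinant(m):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in m]
    n, det = len(a), 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det

def toy_f_matrix(size, diag, generic, neighbour):
    """Hypothetical matrix: `diag` on the diagonal, `neighbour` where
    |i - j| = 1, and `generic` in all remaining entries."""
    return [[diag if i == j else (neighbour if abs(i - j) == 1 else generic)
             for j in range(size)] for i in range(size)]

F_TOY = toy_f_matrix(size=8, diag=0.33, generic=0.01, neighbour=0.05)
```

In the setting of the text the matrix has at most $2s + t \leq 3N^{5\epsilon}$ rows, so the row sums of the off-diagonal bounds stay small once $\epsilon$ is small enough.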

Recall (4.33); then we know that the following event holds with probability $1 - o(1)$: $\tilde\mu$ is not an
eigenvalue of $H^S + \gamma E$ for any $0 < \gamma \leq 1$.

If we let $\mu^\gamma_N$ be the largest eigenvalue of $H^S + \gamma E$ for $0 \leq \gamma \leq 1$, then by definition $\lambda_N = \mu^0_N$,
since $\lambda_N$ is the largest eigenvalue of $H^S$. Since $\tilde\mu$ is not an eigenvalue of $H^S$ either, the continuity of $\mu^\gamma_N$ with respect to $\gamma$ shows
that, for any $0 < \gamma \leq 1$,

$\mu^\gamma_N \in \big(\lambda_N - N^{-3/4}, \lambda_N + N^{-3/4}\big)$ (4.47)

holds with probability $1 - o(1)$. Thus, we have proved (4.26), which implies the desired result (1.5). □



5. Basic ideas for Theorem 3.6 and Theorem 3.7

The basic idea of proving Theorem 3.6 and Theorem 3.7 is the Green function comparison method,
as mentioned in the introduction. To apply the comparison results, first we show that for any $H$ in
Theorem 3.6 there exists a matrix $\tilde H$ whose entries have the same first four moments as those of $H$
and which satisfies the bounded support condition with large $q \sim O(N^{1/2}/\log N)$. Roughly speaking, $\tilde H$
has the properties that we need to prove for $H$, and we will use the Green function comparison method to
show that $H$ has the same properties, since $H$ and $\tilde H$ have the same first four moments.
Lemma 5.1. For any generalized Wigner matrix $H$ under the assumptions of Theorem 3.6, there
exists another generalized Wigner matrix $\tilde H$ such that $\tilde H$ satisfies the bounded support condition with
$q = O(N^{1/2}/\log N)$ and the first four moments of the off-diagonal entries of $H$ and $\tilde H$ match, i.e.,

$\mathbb{E}x_{ij}^k = \mathbb{E}\tilde x_{ij}^k$ $(i \neq j)$, $k = 1, 2, 3, 4$, (5.1)

and the first two moments of the diagonal entries of $H$ and $\tilde H$ match, i.e., $\mathbb{E}x_{ii}^k = \mathbb{E}\tilde x_{ii}^k$, $k = 1, 2$.

Proof of Lemma 5.1. The diagonal part is trivial. For the off-diagonal part, it clearly follows from the
next lemma. □

Lemma 5.2. For any $C > 0$, if $|A| \leq C$ and $B \geq A^2 + 1$, there exists a random variable $x$ such that

$\mathbb{E}(x) = 0$, $\mathbb{E}(x^2) = 1$, $\mathbb{E}(x^3) = A$, $\mathbb{E}(x^4) = B$, (5.2)

and

$\mathrm{supp}(x) \subset [-DB, DB]$ (5.3)

for some $D$ depending only on $C$.

Remark 5.3. The condition $B \geq A^2 + 1$ comes from the simple fact that if $\mathbb{E}(x) = 0$ and $\mathbb{E}(x^2) = 1$
then $\mathbb{E}(x^4) \geq 1 + (\mathbb{E}(x^3))^2$.



Proof. For fixed $A$ and any $B \leq k$, it is easy to find a distribution such that (5.2) and (5.3) hold with
$D$ depending on $k$. Therefore, one only needs to show that this lemma holds in the case that $B$ is large
enough. To show that, first we introduce a family of random variables $Y_t$ $(t \geq 1)$, whose distribution
has a finite support

Bupp(Y,)e j-i,-^L,-^L,<J (5.4) 

and satisfies 

One can easily check that every odd moment of $Y_t$ vanishes and

$\mathbb{E}(Y_t)^2 = 1$, $\mathbb{E}(Y_t)^4 = t$. (5.7)

Note that $Y_t$ is supported in $[-t, t]$.

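We choose another random variable $X$ whose distribution is supported on

$\big\{\sqrt{2}A - \sqrt{1 + 2A^2},\ \sqrt{2}A + \sqrt{1 + 2A^2}\big\}$ (5.8)

and satisfies

$\mathbb{P}\big(X = \sqrt{2}A + \sqrt{1 + 2A^2}\big) = \dfrac{-\sqrt{2}A + \sqrt{1 + 2A^2}}{2\sqrt{1 + 2A^2}}$. (5.9)

Then, a simple calculation shows that

$\mathbb{E}X = 0$, $\mathbb{E}X^2 = 1$, $\mathbb{E}X^3 = 2\sqrt{2}A$, $\mathbb{E}X^4 = 8A^2 + 1$.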
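The stated moments of the two-point variable $X$ can be checked directly from (5.8) and (5.9). The following is a minimal numerical sketch; the function name and the sample value of $A$ are hypothetical.

```python
import math

def two_point_moments(a_param):
    """First four moments of X supported on sqrt(2) A -/+ sqrt(1 + 2 A^2),
    with the upper point weighted as in (5.9)."""
    root = math.sqrt(1.0 + 2.0 * a_param**2)
    low = math.sqrt(2.0) * a_param - root
    high = math.sqrt(2.0) * a_param + root
    p_high = (root - math.sqrt(2.0) * a_param) / (2.0 * root)  # (5.9)
    p_low = 1.0 - p_high
    return [p_low * low**k + p_high * high**k for k in range(1, 5)]

A_EXAMPLE = 0.7  # hypothetical value of the parameter A
MOMENTS = two_point_moments(A_EXAMPLE)
```

The product of the two support points equals $-1$, which is what makes the mean vanish and the variance equal to one.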

Choose

$t = 4B - 8A^2 - 7$

and let

$x = \dfrac{1}{\sqrt{2}}(X + Y_t)$.

Since $X$ and $Y_t$ are independent, it can be easily checked that $x$ satisfies (5.2) and (5.3); in particular, for
large enough $B$, $\mathrm{supp}(x) \subset [-5B, 5B]$. □
Since $\tilde H$ has the property we need (see Lemma 3.1), we now compare $H$ with $\tilde H$ using the Green function
comparison method.

To prove Theorem 3.6 and Theorem 3.7, we first claim the following two lemmas, which will be
proved in the next section.

Lemma 5.4. Let $H$ and $\tilde H$ satisfy the assumptions of Lemma 5.1. For $z \in S(C)$ in (3.1) with large
enough $C > 0$, if for deterministic numbers $X$ and $Y$,

$\max_{i \neq j} |\tilde G_{ij}(z)| \leq X$, $|\tilde m - m_{sc}| \leq Y$ (5.10)

holds with 3-high probability (see Def. 2.3), then for any $p \in 2\mathbb{N}$ with $p \leq \varphi$, we have

$\mathbb{E}|m - m_{sc}|^p \leq \mathbb{E}|\tilde m - m_{sc}|^p + (Cp)^{Cp}\big(X^2 + Y + \varphi^C N^{-1}\big)^p$. (5.11)

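Lemma 5.5. Let $H$ and $\tilde H$ satisfy the assumptions of Lemma 5.1. Let $F : \mathbb{R} \to \mathbb{R}$ be a function whose
derivatives satisfy

$\sup_x |F^{(\alpha)}(x)|(1 + |x|)^{-C_1} \leq C_1$, $\alpha = 1, 2, 3$, (5.12)

with some constant $C_1 > 0$. Then, there exists a constant $\tilde\epsilon > 0$, depending only on $C_1$, such that, for
any $\epsilon < \tilde\epsilon$ and for any real numbers

$E, E_1, E_2 \in \{x : |x - 2| \leq N^{-2/3+\epsilon}\} =: I_\epsilon$, (5.13)

and setting $\eta := N^{-2/3-\epsilon}$, we have

$\big| \mathbb{E}F\big(N\eta\,\mathrm{Im}\,m(z)\big) - \mathbb{E}F\big(N\eta\,\mathrm{Im}\,\tilde m(z)\big) \big| \leq N^{-\phi+C\epsilon}$, $z = E + i\eta$, (5.14)

and for $E_1, E_2 \in [2 - N^{-2/3+\epsilon}, 2 + N^{-2/3+\epsilon}]$,

$\Big| \mathbb{E}F\Big(N \displaystyle\int_{E_1}^{E_2} dy\,\mathrm{Im}\,m(y + i\eta)\Big) - \mathbb{E}F\Big(N \displaystyle\int_{E_1}^{E_2} dy\,\mathrm{Im}\,\tilde m(y + i\eta)\Big) \Big| \leq N^{-\phi+C\epsilon}$. (5.15)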

Proof of Lemma 3.6. Recall that, with (3.4), we have that for some large $C$, with high probability,

$\lambda_N \leq 2 + \varphi^C(N^{-2/3} + q^{-2})$. (5.16)

First, we improve this result to show that

$\lambda_N \leq 2 + \varphi^C N^{-2/3}$ (5.17)

holds with high probability. Let $\tilde H$ match $H$ in the sense of Lemma 5.1. For $\alpha > 12$, with (3.7) and
(3.8), for $z = E + i\eta$ satisfying

$2 + \varphi^C N^{-2/3} \leq E \leq 2 + q^{-1}$, $\eta = \varphi^{3+\alpha}N^{-1}\kappa^{-1/2}$, (5.18)

we have with 3-high probability

$|\tilde m - m_{sc}| \leq \dfrac{1}{\varphi^\alpha N\eta}$, $\mathrm{Im}\,\tilde m(z) \leq \dfrac{1}{\varphi^\alpha N\eta}$, $\max_{i \neq j} |\tilde G_{ij}| \leq \dfrac{1}{\varphi^{\alpha/3} N\eta}$. (5.19)

Assume that $C$ in (5.18) is greater than $8 + 4\alpha$. Then, $\varphi^{2+\alpha}\eta \leq \kappa$. With the property of $m_{sc}$ in (2.12),
we have $\mathrm{Im}\,m_{sc} \sim \eta/\sqrt{\kappa}$, which implies

$\mathrm{Im}\,m_{sc}(z) \leq \dfrac{1}{\varphi N\eta}$. (5.20)

Now, we apply Lemma 5.4 on $H$ and $\tilde H$ with $z$ in (5.18), $p = \varphi$, $X = (\varphi^{\alpha/3}N\eta)^{-1}$ and $Y = (\varphi^\alpha N\eta)^{-1}$.
Then, with (5.11) and Markov's inequality, for some $C > 0$, we obtain with high probability that

$|m - m_{sc}| \leq \varphi^C\big(X^2 + Y + N^{-1}\big)$. (5.21)

From (5.18), we know

$\varphi^{O(1)}N^{-1/3} \leq (N\eta)^{-1} \leq \varphi^{O(1)}N^{-\phi/2}$. (5.22)

Inserting it into (5.21) and choosing $\alpha = C + 1$ in (5.21), we obtain $|m - m_{sc}| \ll (N\eta)^{-1}$. With (5.20),
it implies that $\mathrm{Im}\,m(z) \ll (N\eta)^{-1}$ holds with high probability. By definition,

$\mathrm{Im}\,m = N^{-1}\sum_\alpha \eta\big((\lambda_\alpha - E)^2 + \eta^2\big)^{-1}$.

Then $\mathrm{Im}\,m(z) \ll (N\eta)^{-1}$ implies that there are no eigenvalues in the interval $[E - \eta, E + \eta]$. Since
it holds for any $z$ in (5.18) with high probability, we obtain that there are no eigenvalues in $[2 +
\varphi^C N^{-2/3}, 2 + q^{-1}]$. Together with (5.16), we obtain (5.17). By symmetry, we have

$\|H\| \leq 2 + \varphi^C N^{-2/3}$. (5.23)
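The step from the smallness of $\mathrm{Im}\,m$ to the absence of eigenvalues can be written out in one line. As a sketch, suppose some eigenvalue $\lambda_\beta$ lies in $[E - \eta, E + \eta]$; keeping only that term in the sum gives

```latex
\[
\operatorname{Im} m(E + i\eta)
  = \frac{1}{N}\sum_{\alpha} \frac{\eta}{(\lambda_\alpha - E)^2 + \eta^2}
  \;\geq\; \frac{1}{N}\,\frac{\eta}{(\lambda_\beta - E)^2 + \eta^2}
  \;\geq\; \frac{1}{2N\eta},
\]
```

which is incompatible with $\mathrm{Im}\,m(z) \ll (N\eta)^{-1}$.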

Next, we apply Lemma EH] again on H and H with z in (|3.2[) - fl3.3[l . p = <p and 



^ V JVr? JVr? iVr/ 

where $X$, $Y$ follow from (3.2)-(3.3). Then, with (5.11) and Markov's inequality, we have that for some
$C = O(1)$,

$\max_{z \in S(C)} |m(z) - m_{sc}(z)| \leq \varphi^C (N\eta)^{-1}$ (5.24)

holds with high probability. Following the argument of Section 5 in [26], which was used to prove
(2.25) and (2.26) in [26], we obtain Lemma 3.6. Note that the argument carries over almost verbatim, except for
some coefficients, where $\varphi^C$ plays the role of $T_N$ there. □

Proof of Theorem 3.7. We first recall the following lemma, which is basically proved in [26].

Lemma 5.6. Suppose that two generalized Wigner matrices H w and H v satisfy with high probability 
that (|3~T2"j) . (j3~T3) . (EH, and 

$|N^{2/3}(\lambda_N - 2)| \leq \varphi^C$ (5.25)
and the number of eigenvalues in $[2 - 2\varphi^C N^{-2/3}, 2 + 2\varphi^C N^{-2/3}]$ is bounded as follows:

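for some constant $C$. If, moreover, they satisfy the conditions from (5.12) to (5.15), then there exists
a constant $\delta$ such that, for any $s > 0$,

$\mathbb{P}^w\big(N^{2/3}(\lambda_N - 2) \leq s - N^{-\delta}\big) - N^{-\delta} \leq \mathbb{P}^v\big(N^{2/3}(\lambda_N - 2) \leq s\big) \leq \mathbb{P}^w\big(N^{2/3}(\lambda_N - 2) \leq s + N^{-\delta}\big) + N^{-\delta}$. (5.27)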



Proof of Lemma 5.6. Though this lemma is not explicitly stated in [26], this is the basic structure
of the proof of the edge universality, Theorem 2.4, in [26]. In Section 6 of [26], the edge universality
problem was converted into proving Theorem 6.3 in [26]. The conditions from (5.12) to (5.15) in this
paper are exactly the same as those of Theorem 6.3 in [26]. To obtain this conversion, in Section 6 of [26], only
the assumptions in Lemma 5.6 of this paper were used. To help readers compare, we note that they
are (2.19), (2.25), (2.26), (6.2), and (6.3) in [26]. □



Now, we return to the proof of Theorem 3.7. We apply Lemma 5.6 with $H = H^v$ and $\tilde H = H^w$. Clearly,
it only remains to check (5.25) and (5.26). One can see that it follows from (3.12), the rigidity of
eigenvalues, and that $\gamma_N = 2$. Then, with Lemma 5.6,

$\tilde{\mathbb{P}}\big(N^{2/3}(\tilde\lambda_N - 2) \leq s - N^{-\delta}\big) - N^{-\delta} \leq \mathbb{P}\big(N^{2/3}(\lambda_N - 2) \leq s\big) \leq \tilde{\mathbb{P}}\big(N^{2/3}(\tilde\lambda_N - 2) \leq s + N^{-\delta}\big) + N^{-\delta}$, (5.28)

where $\tilde\lambda_N$ denotes the largest eigenvalue of $\tilde H$. Furthermore, since $\tilde H$ satisfies the bounded support
condition with large $q \sim N^{1/2}/\log N$, with Lemma 3.5 we obtain (3.14) and complete the proof of
Lemma 3.7. □

6. Proof of Lemma 3.8, Lemma 5.4 and Lemma 5.5

To prove Lemmas 15.41 and 15.51 we will again use Green function comparison method. Recall the 
notations in ([3~?T]) -([3~25 )) with H = H = H w and H = i? 7max = H v . We let S = G 7 = (# 7 - z)' 1 , 
$R = (Q - z)^{-1}$, and $T = G_{\gamma-1} = (H_{\gamma-1} - z)^{-1}$, where $Q$ depends on $\gamma = \Phi(a, b)$. Note that $H_\gamma$
satisfies the bounded support condition with $q = N^{\phi}$ for all $1 \leq \gamma \leq \gamma_{\max}$.

From Lemma 3.1 and (3.28), we know that there exists a uniform constant $C_0 > 0$ such that, with
3-high probability,

$\max_{z \in S(C_0)} \max_\gamma \max_{i,j} \max\{|S_{ij}(z)|, |R_{ij}(z)|, |T_{ij}(z)|\} \leq 2$, (6.1)

where we used the bound $|m_{sc}| \leq 1$. We note that the uniformity here is easy to check, since there are
only finitely many different distributions for all the matrix entries of $H_\gamma$, $1 \leq \gamma \leq \gamma_{\max}$. On the other
hand, $S$, $R$, and $T$ satisfy the following trivial bound that always holds:

$\max_{z \in S(C_0)} \max_\gamma \max_{i,j} \max\{|S_{ij}(z)|, |R_{ij}(z)|, |T_{ij}(z)|\} \leq \eta^{-1} \leq N$. (6.2)

In this section, for simplicity, we use the notation $|\mathbf{k}| = \|\mathbf{k}\|_1$ for any vector $\mathbf{k} \in \mathbb{R}^n$, $n \in \mathbb{N}$.

To illustrate the idea of the Green function comparison method, we first consider the following simple
example of finding a bound on $\mathbb{E}G_{ij}$ from an a priori bound on $\tilde G_{ij}$:

Example 6.1. Suppose that the bound

$\max_{i \neq j} |\tilde G_{ij}| \leq N^{-1/3}$ (6.3)

holds. Then, we have that $|\mathbb{E}G_{ij}| \leq CN^{-1/3}$.
Applying the replacement strategy, we obtain

$\mathbb{E}G_{ij} = \mathbb{E}\tilde G_{ij} + \sum_{\gamma=1}^{\gamma_{\max}} \big(\mathbb{E}G^{\gamma}_{ij} - \mathbb{E}G^{\gamma-1}_{ij}\big)$. (6.4)

Note that the bound (6.3) implies that $|\mathbb{E}\tilde G_{ij}| \leq N^{-1/3}$. Thus, if we can prove that for $1 \leq \gamma \leq \gamma_{\max}$

$|\mathbb{E}S_{ij} - \mathbb{E}T_{ij}| \leq CN^{-7/3}$, (6.5)

then this will show the desired estimate on $|\mathbb{E}G_{ij}|$.
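The counting behind this reduction is short. As a sketch, assume $\gamma_{\max} = N(N+1)/2 \leq N^2$, the number of independent entries of a symmetric matrix (this count is an assumption here, since the ordering $\Phi$ is defined elsewhere in the paper). With $S = G^\gamma$ and $T = G^{\gamma-1}$, the telescopic sum (6.4) and the bound (6.5) give

```latex
\[
|\mathbb{E} G_{ij}|
  \;\leq\; |\mathbb{E} \tilde G_{ij}|
     + \sum_{\gamma=1}^{\gamma_{\max}} \big| \mathbb{E} G^{\gamma}_{ij} - \mathbb{E} G^{\gamma-1}_{ij} \big|
  \;\leq\; N^{-1/3} + N^{2} \cdot C N^{-7/3}
  \;=\; (1 + C)\, N^{-1/3}.
\]
```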

We now expand $S_{ij}$ in terms of $R$ and $V$, as in (3.28), using the resolvent expansion

$S = R - RVR + (RV)^2R - (RV)^3R + (RV)^4R + \cdots + (-1)^m(RV)^mR + (-1)^{m+1}(RV)^{m+1}S$,

with $m := [1/\phi]$. Since each element of $V$ is bounded by $N^{-\phi}$ with high probability, from (6.1), we
find that the last term in the expansion is $O(N^{-1})$. Taking expectation, we find that

$\mathbb{E}S_{ij} = \mathbb{E}\sum_{k=0}^{m} \big[(-RV)^kR\big]_{ij} + O(N^{-1})$. (6.6)
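The resolvent expansion is an exact algebraic identity for every truncation order $m$, which can be confirmed on a small case. The sketch below works with hypothetical $2 \times 2$ matrices and a hypothetical spectral parameter; it illustrates only the identity, not the probabilistic estimates.

```python
def mat_mul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_add(a, b):
    """Entrywise sum of two matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def resolvent_2x2(h, z):
    """(h - z)^{-1} for a 2 x 2 matrix h and a complex number z."""
    m = [[h[0][0] - z, h[0][1]], [h[1][0], h[1][1] - z]]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det], [-m[1][0] / det, m[0][0] / det]]

def truncated_expansion(r, v, s, m):
    """sum_{k=0}^{m} (-RV)^k R + (-RV)^{m+1} S, which equals S exactly."""
    minus_rv = [[-x for x in row] for row in mat_mul(r, v)]
    power = [[1.0, 0.0], [0.0, 1.0]]      # (-RV)^0
    total = [[0.0, 0.0], [0.0, 0.0]]
    for _ in range(m + 1):
        total = mat_add(total, mat_mul(power, r))
        power = mat_mul(power, minus_rv)  # now (-RV)^(k+1)
    return mat_add(total, mat_mul(power, s))

# Hypothetical data: Q is H with the (a, b) and (b, a) entries set to zero.
H_EX = [[0.3, 0.2], [0.2, -0.1]]
Q_EX = [[0.3, 0.0], [0.0, -0.1]]
V_EX = [[0.0, 0.2], [0.2, 0.0]]           # V = H - Q
Z_EX = 2.0 + 0.5j
S_EX = resolvent_2x2(H_EX, Z_EX)
R_EX = resolvent_2x2(Q_EX, Z_EX)
```

The identity follows by iterating $S = R - RVS$, which is itself a rearrangement of $S^{-1} = R^{-1} + V$.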

Note that $R$ is independent of $V$. We can decouple $R$ and $V$ by taking the partial expectation $\mathbb{E}_{ab}$ with
respect to $V_{ab}$, which gives

4 

ES^ =Ej2A k E ab (V* b ) + 0(N- V ), (6.7) 

k=0 

where $A_k$ depends only on $R$. For example, $A_5$ contains a term such as $R_{ia}R_{bb}R_{aa}R_{bb}R_{aa}R_{bj}$.

From the moment matching condition, we know that the first four moments of $H$ and $\tilde H$ coincide,
thus the terms with $k = 0, 1, \cdots, 4$ will vanish when we estimate $\mathbb{E}S_{ij} - \mathbb{E}T_{ij}$. Moreover, since we
know that $|\mathbb{E}(V_{ab}^k)| = O(N^{-2-\phi/2})$, it suffices to prove the bound $|\mathbb{E}A_k| \leq CN^{-1/3+\phi/2}$ for $5 \leq k \leq m$.
Now using an expansion such as (see (6.24))

S S Til 

El[R ltJt =El[S ttJt -Ej^AMV^ + OiN- 1 ), 
t=i t=i 1=1 

where $\tilde A_l$ are products of $S$, we write the expectation of the product of the elements of $R$'s (like $A_k$) as
the sum of the expectation of the product of the elements of $S$'s. Hence, we can convert the problem
into showing

$|\mathbb{E}B_k| \leq CN^{-1/3+\phi/2}$, $k \geq 5$, (6.8)

where $B_k$ is a sum of the product of the elements of $S$. For example, $B_5$ contains a term such as
$S_{ia}S_{bb}S_{aa}S_{bb}S_{aa}S_{bj}$. Note here we effectively gain a factor of $N^{-\phi/2}$ in the required bound.

We now repeat the replacement argument with the terms in $B_k$, since the matrix $H_\gamma$ also satisfies
the same bounded support condition as $H$. We consider a telescopic series, as in (6.4),

$\mathbb{E}B_k = \mathbb{E}B_k(S \to \tilde G) + \sum_{\gamma'=1}^{\gamma} \big(\mathbb{E}B_k(S \to G^{\gamma'}) - \mathbb{E}B_k(S \to G^{\gamma'-1})\big)$, (6.9)

where the notation $B_k(S \to G^\gamma)$ means that we consider the product of $G^\gamma = (H_\gamma - z)^{-1}$ instead of $S$
while keeping the indices the same. The first term on the r.h.s. of (6.9) will be the sum of the products
such as $\mathbb{E}\tilde G_{ia}\tilde G_{bb}\tilde G_{aa}\tilde G_{bb}\tilde G_{aa}\tilde G_{bj}$, and in a generic case, it contains at least one off-diagonal term of $\tilde G$.
In particular, we have $|\mathbb{E}B_k(S \to \tilde G)| \leq CN^{-1/3}$. Hence, we are left to estimate the telescopic sum,
where we again use the argument above, which involves the resolvent expansion and partial expectation.
Note that, each time we repeat the procedure, we effectively gain a factor of $N^{-\phi/2}$ in the
required bound. Therefore, after repeating the procedure sufficiently many times (i.e., $O(1/\phi)$ times),
we find that it suffices to prove that $|\mathbb{E}B| \leq C$, where $B$ is a sum of the products of the elements of
$G^\gamma$, $1 \leq \gamma \leq \gamma_{\max}$. Since this is trivial from the bound (6.1), we find that the bound $|\mathbb{E}G_{ij}| \leq CN^{-1/3}$

holds.

We now prove the lemmas using the ideas explained above. We first introduce some notations for
simplifying the expressions, which will help us study the expectation of the product of $S_{ij}$'s.

Definition 6.2 (Matrix operators $*_\gamma$ and $*$). For a $\gamma$ with $\Phi(a, b) = \gamma$, we define $A *_\gamma B$ as

$(A *_\gamma B)_{ij} = (A\,\mathbf{1}_\gamma B)_{ij}$, $(\mathbf{1}_\gamma)_{ij} = \mathbf{1}\big(\{i,j\} = \{a,b\}\big)$. (6.10)

When $a \neq b$, it satisfies

$(A *_\gamma B)_{ij} = A_{ia}B_{bj} + A_{ib}B_{aj}$. (6.11)

We will often drop the subscript $\gamma$ for convenience and write $A * B$. For simplicity, we denote the $k$-th power
of $A$ under the $*$-product by $A^{*k}$, i.e.,

$A * A * A * \cdots * A = A^{*k}$. (6.12)
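The $*_\gamma$ product is an ordinary matrix product with the indicator matrix $\mathbf{1}_\gamma$ inserted, so it is associative and its powers are well defined. The following sketch implements it for small integer matrices with 0-based indices; the helper names are hypothetical.

```python
def mat_mul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def star(a_mat, b_mat, a, b):
    """(A *_gamma B) = A 1_gamma B as in (6.10), where 1_gamma has ones
    exactly at the positions (i, j) with {i, j} = {a, b}."""
    n = len(a_mat)
    indicator = [[1 if {i, j} == {a, b} else 0 for j in range(n)]
                 for i in range(n)]
    return mat_mul(mat_mul(a_mat, indicator), b_mat)

def star_power(a_mat, k, a, b):
    """A^{*k}: the star product of k copies of A, for k >= 1, as in (6.12)."""
    result = a_mat
    for _ in range(k - 1):
        result = star(result, a_mat, a, b)
    return result
```

With integer entries the comparison with (6.11) is exact, which makes the identity easy to test.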
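Definition 6.3 ($\mathcal{P}_{\gamma,k}$ and $\mathcal{P}_{\gamma,\mathbf{k}}$). For $k \in \mathbb{N}$ and $\mathbf{k} = (k_1, k_2, \cdots, k_s) \in \mathbb{N}^s$, $\gamma = \Phi(a, b)$, we define

$\mathcal{P}_{\gamma,k}G_{ij} := G^{*_\gamma(k+1)}_{ij}$ (6.13)

and

$\mathcal{P}_{\gamma,\mathbf{k}}\Big(\prod_{t=1}^{s} G_{i_t j_t}\Big) := \prod_{t=1}^{s} \mathcal{P}_{\gamma,k_t}G_{i_t j_t} = \prod_{t=1}^{s} G^{*_\gamma(k_t+1)}_{i_t j_t}$. (6.14)

If $\mathcal{G}_1$ and $\mathcal{G}_2$ are products of matrix entries of $G$'s as above, then we define

$\mathcal{P}_{\gamma,\mathbf{k}}(\mathcal{G}_1 + \mathcal{G}_2) := \mathcal{P}_{\gamma,\mathbf{k}}\,\mathcal{G}_1 + \mathcal{P}_{\gamma,\mathbf{k}}\,\mathcal{G}_2$. (6.15)

Note that $\mathcal{P}_{\gamma,k}$ and $\mathcal{P}_{\gamma,\mathbf{k}}$ are not linear operators but just notations we use for simplification. Similarly,
for the product of the entries of the matrix $G - m_{sc} = G - m_{sc}I$, we define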



\t=i 



where 



V 7ik (G lj - m sc )i 



(Gij - m sc ) 



i j 

ij 



if i = j and k = 0, 



T-'i.kGij = Q*j( k+1 ) otherwise. 



Using Definition 16. 14[ we may write, for example, 

^(n^l =ii s ;£ t+1) > ^fn^l =irc 



,(fc t +i) 



(6.16) 



(6.17) 



(6.18) 



t=i 



(6.19) 



With the fact that $G^{*s} *_\gamma G^{*t} = G^{*(s+t)}$, one can easily see that for $k \in \mathbb{N}$ and $\mathbf{k} \in \mathbb{N}^{k+1}$,

$\mathcal{P}_{\gamma,\mathbf{k}}(\mathcal{P}_{\gamma,k}G_{ij}) = \mathcal{P}_{\gamma,k+|\mathbf{k}|}G_{ij}$.

Here, note that $\mathcal{P}_{\gamma,k}G_{ij}$ is the sum of the products of the matrix entries of $G$, where each product
contains $k + 1$ matrix entries of $G$.

Using the definitions above, we have the following lemma from the bound (6.1). Recall that $R$, $S$,
and $T$ depend on $\gamma$.


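Lemma 6.4. For any $\mathbf{k} \in \mathbb{N}^s$, $\gamma$, $\gamma'$ and $i_1, j_1, \cdots, i_s, j_s$, we have with 3-high probability that

$\Big| \mathcal{P}_{\gamma',\mathbf{k}} \prod_{t=1}^{s} A_{i_t j_t} \Big| \leq 4^{s+|\mathbf{k}|+1}$ (6.20)

where $A$ can be $R$, $S$, or $T$.

The following lemma shows how we expand the expectation of the term $S_{i_1 j_1}S_{i_2 j_2} \cdots S_{i_s j_s}$.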

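Lemma 6.5. Let $S = (H_\gamma - z)^{-1}$ as above and $\Phi(a, b) = \gamma$. Assume $z \in S(C_0)$ for $C_0$ in (6.1). Fix

$s = O(\varphi)$ and $\zeta = O(\varphi)$. Then, for any

$S_{i_1 j_1}(z)S_{i_2 j_2}(z) \cdots S_{i_s j_s}(z)$

we have, with $\alpha = (\alpha_1, \alpha_2, \cdots, \alpha_s) \in \mathbb{N}^s$, $|\alpha| = \sum_i \alpha_i$,

$\mathbb{E}\prod_{t=1}^{s} S_{i_t j_t} = \sum_{0 \leq k \leq 4} A_k\,\mathbb{E}\big[(-v_{ab})^k\big] + \sum_{5 \leq |\alpha| \leq 2\zeta/\phi,\ \alpha_1, \alpha_2, \cdots, \alpha_s \geq 0} \mathcal{A}_\alpha\,\mathbb{E}\,\mathcal{P}_{\gamma,\alpha}\prod_{t=1}^{s} S_{i_t j_t} + O(N^{-\zeta})$, (6.21)

where $A_k$ $(0 \leq k \leq 4)$ depend only on $R$, $\mathcal{A}$'s are independent of $(i_t, j_t)$, $1 \leq t \leq s$, and

$|\mathcal{A}_\alpha| \leq N^{-|\alpha|\phi/10}N^{-2}$. (6.22)

Similarly, we have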

s 5<M<2C/0 s 

®T[((S - m sc ) itjt ) = A a E[(-v ab ) a <] + £ A a EV lta \[S ltn +0(N^), (6.23) 



0<a<4 



Ql ,CK2 ,...,Ct s >0 



where again Ai (0 < i < 4) depend only on R. 
Furthermore, as (|6.21[) . we /lave 



s s l<|a|<2f/0 s 

t=l t=l ai,Q 2 ,...,a a >0 t=l 



(6.24) 
where $\tilde{\mathcal{A}}$ are independent of $(i_t, j_t)$, $1 \leq t \leq s$, and

$|\tilde{\mathcal{A}}_\alpha| \leq N^{-|\alpha|\phi/10}$. (6.25)
Note that the terms $\mathcal{A}$ and $\tilde{\mathcal{A}}$ depend on $\gamma$.
We prove this lemma later in the section.

Recall that we let $S = G_\gamma = (H_\gamma - z)^{-1}$, $T = G_{\gamma-1} = (H_{\gamma-1} - z)^{-1}$. Note that the entries of $H_\gamma$
and $H_{\gamma-1}$ coincide except for the $(a, b)$ and $(b, a)$ entries, where $(H_\gamma)_{ab} = (H_\gamma)_{ba} = v_{ab} = h_{ab}$ and
$(H_{\gamma-1})_{ab} = (H_{\gamma-1})_{ba} = w_{ab} = \tilde h_{ab}$. It is obvious that a result similar to Lemma 6.5 holds for the
product of $T$. Thus, as in (6.21), we define the notation $\mathcal{A}^{\gamma,\sigma}$, $\sigma = 0, 1$, as follows:

s 5<|q|<2C/0 s 

= E ^ a E[(-v ab ) k }+ A^ a EV^ a l[S Hn +O(N-i), (6.26) 

t=l 0<fc<4 ai,02,...,a i >0 t=l 



8 5<|q|<2C/0 s 

t=l 0<A;<4 qi,q 2 ,...,q s >0 t=l 

Using these two identities, we have 
t=l t=l 

= Ef E n G ^- E fl G ^l (6.28) 



= e e ^'° e ^ n ^ - ^ <r. M n g ^ + o(^- c ) 

7 k:5<|k|<2C/</> V=l / \t=l / 

where we used that (0 < k < 4) depends only on R and the first four moments of v a b = (-ff 7 ) a b 
and w a b = (Hj-i) a b match. Then, we obtain that 



■ IT G Sr 



< 



EE E w a \ 

7 o=0,l k:5<|k|<2C/0 



+ 0{N- C ). (6.29) 



For the terms that belong to the fixed |k| = k, we can see from (|6.20j) and (|6.22|) that they are bounded 
by 



''■■'II KG) 7 , Ktp 



(6.30) 



Then, the second part of the r.h.s. of (|6.29[) is less than CN 21 4 s where we used k > 5. 

Recall that P 7 k (IIt=i ^"l~jt) ^ s a ^ so a sum °^ * ne P r °ducts of G. Using the result (|6.28[) again on 
the |E"P 7) k (rit=i GJdt) \- wriere we replace the 7 max with 7 — a in the left hand side, we obtain the 
following bound as in (|6.29|) : 



\t=i 

E7> k [f[G 



< 





ll.il 



EE E i^'' a 'i 

7' o'=0,l k':5<|k'|<2C/0 



+ 0(7V"«), 

(6.31) 
where $\mathbf{k}' \in \mathbb{N}^{s+|\mathbf{k}|}$. Thus, together with (6.29), we have



E n G 5r 



< 



e n G in 



E E W 

k:5<|k|<2C/0 7,<* 



7', 7 a, a' k' ,k 



\i=l 



(6.32) 



+ 0(N-<). 



Since |.A k | < (JV~ 2_ ^l k l/ 10 ), for the terms, in the second line of ff6T32]) . that belong to the fixed 
|k| + |k'| = k, as in ()6.30j) . it is easily to be bounded by 

]\j-^^ k+s < N~^A S . 

Hence, the sum in the second line of (|6.32[) is less than jV~^v4 s + 0(7V - ^), where we used that 
|k| + |k'| > 10. Repeating this process, we make the sum smaller and smaller. At the end, we obtain 
that 



1=1 

6C/0 



(6.33) 



+ 0(N-<), 



^E E E E HI^H 

n=0 7i,72,"- ,7n 01,02, ••• ,&n k i, k 2--- ,k„ j 

where 

n<6C/0, kxGR 8 , k 2 el s *l k 3 G M s+ I kl l + I k2 l, etc., and 5 < |k*| < 2C/</». (6.34) 
Using the bound |.4 k | < (jV- 2 -*l k l/ 10 ) again with s, ( < 0(ip), we have 



=1 

e n G ih 



< 



t=i 



+ max(iV- 2 ) rl (^" 0/2O ) l: ' |ki 



E 



7li72,"' ,7n 



Ep Tn , k „---p7 1 , kI n G L 



+ o(iv- c ). 

(6.35) 



Note that the first term in the right hand side is from the sum with n = in (|6.33j) and the 7 max in 
the left hand side can be replaced with any < 70 < 7max- 

Since these A and V are independent of it and j\ (1 < t < s), we may consider a linear combination 
of (|6.35[) . i.e., for a coefficient function /(/, J) with 



£)/(J,J) = l, f(I,J)>0, 7=(i 1 ,i 2 ,--- ,*.), J={juh,--- ,js), (6.36) 

we have 

E/^ J )n G S: 



< 



;E/( J ' J )II G L 



+ max(A^/ 20 )^l ki l 

k,n,7 



In application, we let # = iJ 7max , # = 



E^/(/,j)p 7n , kn ---p 71 , kl n G u 



(6.37) 
+ 0(A^ C ). 
Similarly, with (6.24) we can extend (6.37) to



j, j *=i 



< 



E E/( j ' j )ri(( G - m -)^) 



■ max(iV-^ 20 )^. I k 'l 

7,n,k 



E£/(J, J)7\,, k „ ■•■P 7llkl n(5-m.c) Wl 



z.j 



(6.38) 

+ 0(N- C ). 

ir ) s and the first term 



Proof of Lemma \3.8i Let H — H 1 ™ 1 ^ and H = H°. Since if satisfies the bounded support condition 
with q ~ jV _1 / 2 /log AT, we have from (J3T3J) that, for z satisfying the assumption in Lemma \3.8\ with 
high probability, 



We note that, in the case / = iV~ s J\ 5i t j t > the left hand side equals to E(m — 
in the right hand side equals to E(m — m sc ) s . 
Now, we first use (|6.33l) to prove Lemma [3~8l 



icy 



Imm sc 1 
Nrj + Nr] 



25i 



(6.39) 



On the other hand, we have a trivial bound \G %] \ < N. (See ([BT2]).) We now apply (|Q3)) on GyGy 
with s = 2 and £ = 1. To prove the lemma, it suffices to show that the following holds for any ki, 
1*2 •• • , k„, for n satisfying <|6.34[) : 

/ Im m sc 1 



N~ 



E 



7n Mi 



■ ■■n 



< 



71:72, ••• -In 



(N V y 



(6.40) 



With Lemma [2~B1 it is easy to check that the right hand side is larger than N 1 . Let $(74) = (a t , bt). 
It only remains to prove that 

/ Im m sc 1 



max 

1l,72,— i J^Ui< t <„{a t ,6 t } 



EV. 



^71 ,ki 



V N v 



(Nr,)- 



(6.41) 



By definition, V-y n ^ n ■ ■ ■ V^^GijGij is a finite sum of the products of the matrix entries of G and G. 
Furthermore, for each product, there exist at least two off diagonal terms, since the index i appears 

exactly twice and there is no Gu term in "P 7n; k„ ■ • ■T' 11 ^ 1 GijGij. From the existence of these two 
off-diagonal terms and from (|6.39[) , we obtain (|6.41[) and complete the proof of Lemma 13.81 □ 



Next, we use (6.38) to prove Lemma 5.4.

Proof of Lemma 5.4. For simplicity, we prove instead that

$|\mathbb{E}(m - m_{sc})^p| \leq |\mathbb{E}(\tilde m - m_{sc})^p| + (Cp)^{Cp}\big(X^2 + Y + \varphi^C N^{-1}\big)^p$. (6.42)

(The proof of (5.11) is exactly the same except that it involves more terms with more complicated
expressions.) Using (6.38) with $i_t = j_t$, $s = \zeta = p$ and $f(I, J) = N^{-p}\prod_t \delta_{i_t j_t}$, since $\mathcal{A}$ are
independent of $i_t, j_t$ for any $1 \leq t \leq s$, we have



|E(m - m sc ) p \ < |E(m - m sc ) p \ 



7,k,n 



E ^ E ^7„,k„---^7 1 ,k 1 (n(^.H-"» TC )J 

ii— ,i p \t=l / 



(6.43) 

+ o(iv- c ). 
With the assumption (5.10), the first term on the right-hand side is bounded by $(Y + \varphi^C N^{-1})^p$,
where for the bad event of the probability space we used (6.2). In order to complete the proof, we now
only need to bound the second term on the right-hand side of (6.43). For any fixed $\gamma_1, \cdots, \gamma_n$,
$\mathbf{k}_1, \cdots, \mathbf{k}_n$ and $i_1, \cdots, i_p$ satisfying (6.34), we know that

$\tilde{\mathcal{P}}_{\gamma_n,\mathbf{k}_n} \cdots \tilde{\mathcal{P}}_{\gamma_1,\mathbf{k}_1}\Big(\prod_{t=1}^{p} (G_{i_t,i_t} - m_{sc})\Big)$ (6.44)

is the sum of at most $C^{\sum_i |\mathbf{k}_i|}$ products of $G_{ij}$ (including the terms with $i = j$) and $(G_{ii} - m_{sc})$,
where the total number of $G_{ij}$ and $(G_{ii} - m_{sc})$ is $\sum_i |\mathbf{k}_i| + p = O(\varphi^2)$. Since $G$ has a rough bound
$|G(z)| \leq \eta^{-1} \leq N$, we know (6.44) is always less than $N^{O(\varphi^2)}$. With the assumption that (5.10) holds
with 3-high probability, we notice that the event that (5.10) does not hold is negligible. Furthermore,
for each product in (6.44) and any fixed $t$, $1 \leq t \leq p$, we know there are two $i_t$'s in the indices of $G$.
These two $i_t$'s can only appear as (a) $G_{i_t i_t} - m_{sc}$ in the product, or (b) $G_{i_t,a}G_{b,i_t}$, where the indices
$a$ and $b$ come from some $\gamma_k$ and $\gamma_l$ $(1 \leq k, l \leq n)$ via $\mathcal{P}$. Thus, after averaging over $1 \leq i_t \leq N$,
this term becomes (a) $\tilde m - m_{sc}$, which is bounded by $Y$ (see (5.10)), or (b) $N^{-1}\sum_{i_t} G_{i_t,a}G_{b,i_t}$, which
is bounded by $X^2 + CN^{-1}$ with (5.10). In the case (b), we also used the fact that the number of
non-generic terms with $i_t = a$ or $i_t = b$ is smaller than that of generic terms by a factor $N^{-1}$, hence
we bound the contribution from the non-generic terms by $CN^{-1}$.

Therefore, we have shown that after averaging over $1 \leq i_1, i_2, \cdots, i_p \leq N$, i.e., applying $N^{-p}\sum_{i_1, \cdots, i_p}$,
each $i_t$ either contributes a factor $\tilde m - m_{sc}$, i.e., $Y$, or $N^{-1}\sum_{i_t} G_{i_t a}G_{b i_t}$, i.e., $(X^2 + CN^{-1})$. For any
other $G$'s in the product with no $i_t$ $(1 \leq t \leq p)$, we simply bound them by $C$; then for any fixed
$\gamma_1, \cdots, \gamma_n$, $\mathbf{k}_1, \cdots, \mathbf{k}_n$, we have proved that



Np . 

ll*" i^p \t=l 



< c^i\ k *\+P(X 2 + Y + CN~ 1 ) p . (6.45) 



With (6.43), this completes the proof of Lemma 5.4. □
Proof of Lemma 6.5. Choose

$\ell = 2\zeta/\phi$. (6.46)

We apply the expansion

$S = R - RVR + (RV)^2R + \cdots + (-1)^\ell(RV)^\ell R + (-1)^{\ell+1}(RV)^{\ell+1}S$. (6.47)

With the condition $|v_{ab}| \leq N^{-\phi}$, we note that the last term in this expansion is $O(N^{-\zeta})$. Thus,

$S = R - RVR + (RV)^2R + \cdots + (-1)^\ell(RV)^\ell R + O(N^{-\zeta})$. (6.48)

Since $V_{kl} = 0$ if $\{k, l\} \neq \{a, b\}$, we have

$\big([-RV]^m R\big)_{i_t j_t} = \sum_{(a_i, b_i) \in \{(a,b),(b,a)\}:\ 1 \leq i \leq m} R_{i_t a_1}(-V_{a_1 b_1})R_{b_1 a_2} \cdots (-V_{a_m b_m})R_{b_m j_t}$. (6.49)

We note that $V_{ab} = V_{ba} = v_{ab} = h_{ab}$. Using Definition 6.2 and Definition 6.3, we have

[-RV] k R = i?* (fc+1) (-M'% s = Y (R* (k+1) )(-h ab ) k + 0(N-*+). (6.50) 

0<fc<£ 

Similarly, 

R U= Y (P-y,kSi j )(hab) k + 0(N-t' 1 '), (R-m sc ) ij = Y <T>rM(h >A ) k + 0(N-e*). (6.51) 

0<fc<£ 0<fc<£ 

For this reason, we only show the proof of (6.21), and (6.23) can be proved analogously. The proof of
(6.24) will roughly be explained at the end of this proof.
Using (6.50) and (6.1), with 3-high probability, we have that (with the definitions in (6.53))

I[Si t h= E E ^n%| (-h ab ) k *+0(2 s N-^) (6.52) 

t=l 0<fc<gs; keif fc V t=l / 

where 

k:= (k u k 2) --- ,k s ), 4 a iC :={keK a :0<^<6,E^ = c } (6-53) 
Note that, from the above definition, 

\ISJ <a c . (6.54) 

We note that the term in (6.52) belonging to $k = 0$ is $\prod_{t=1}^{s} R_{i_t j_t}$. For the terms belonging to $k > \xi$,
using (6.20), we know that with 3-high probability they are bounded by

\[ \sum_{k > \xi} s^k 4^{k+s} N^{-k\phi} \le O(s^{\xi} 4^{\xi+s} N^{-\xi\phi}), \tag{6.55} \]
where $s^k$ comes from the number of terms in the sum over $\mathbf{k}$. Hence,

s s I s \ 

n^=n>«*+ E E ^.kll^ (-M fc + 0(s«4«+ s iV-^). (6.56) 
t=i *=i i<fc<c \ke/| k t=i J 

Recall that $(S^{*s})_{ij}$ is a sum of terms as in Definition 6.2. As a special case, consider a term in
$(S^{*s})_{ij} = \mathcal{P}_{\gamma, s-1} S_{ij}$ and rewrite it as $S_{i_1 j_1} S_{i_2 j_2} \cdots S_{i_s j_s}$. We have

(5*% = (JT')y + E f E (^,k(fl M )«)J (-M fc +0(^4«+*iV-^). (6.57) 

Then, with (6.19), we have (here we replaced $s$ with $s + 1$ for simplicity)

V- y , s S ij =V^ s R ij + E \ I ^ 1 \('P'r,'+kR)ij(-hab) k + 0(8^ + 'N-^), (6.58) 
i<fe<? 

Define, for $0 \le k \le 4$,

A k := E = E ti<n +1) ( 6 - 59 ) 

ke/| fc keJ|, fe t=i 

Clearly, they depend only on $R$. Thus, as in (6.56), with 3-high probability,

s 4 / s \ 

II = E ^(-^) fe + E E I] + 0(.s«4« +s 7V-^). (6.60) 

t=l fc=0 5<fc<£ \ke/| k t=i J 

We take the expectation $\mathbb{E}$ on both sides of the equation. Recall that the good event holds with
3-high probability and the entries of $S$ and $R$ are bounded by $\eta^{-1} \le N$ (see (6.2)). Furthermore, in
this proof, no product has more than $O(\phi^{-2})$ entries of $S$ or $R$. Thus, when taking the expectation
$\mathbb{E}$, we can simply ignore the bad event.
To simplify the notation, we define

n k :=E(-h ab ) k . (6.61) 

Then, we get 

s 4 / s \ 

, E"— U ' E n M E ^,ul[Ri tjt ) +0(s^+ s N-^) (6.62) 

t=l fc=0 5<fc<£ \ k G/| fc *=1 / 
To estimate $\mathbb{E}\, \mathcal{P}_{\gamma, \mathbf{k}} \prod_t R_{i_t j_t} = \mathbb{E} \prod_t \mathcal{P}_{\gamma, k_t} R_{i_t j_t}$, we first use (6.58) and obtain

V 1 , kt Si th =V lM R itjt + l / ^!j^7,(fc t +M^ t (-^) it +0((4fc t +4)«- fc 4 fc '+ 1 ^+^). 

l<h<i-k 

(6.63) 

Note that from (6.54) we have a bound

\I^ h \<(h + l) h . 

From that \P~ hkt S ltH \ < 4 fct+2 (see (jOOT) ). for < k x , k 2 , ■ ■ • , k s < (/</>, h = k and k < £/<f>, we 
obtain that 



^11^=^11^ + E E U^-X^M+hRuh )(-h ab ) 

t=i t=i i<i<(S-fc)« \ie/|_ M * / 

+ 0((4fc + 4)«- fe 4 fc+s iV-^ + ^). 

For the terms that belong to $l > \xi - k$, from (6.20) and (6.54), they are bounded by

]T s'(fc + l) J 4 fe+ ' +s iV-^ < (sfc + s )«- fc 4« +s iV-^+^. 
i>£-fc 

Then, the upper bound of $l$ in (6.64) can be reduced to $\xi - k$ as follows:



t=i t=i i<i<5-fe UeJ|_ fe , * 

+ 0((sfc + s )«- fc 4« +s iV-^+ fe ^) 

We observe that 



4 



+ 0(£(s£ + s)'4 s+ «iV-^) 



(6.64) 



(6.65) 



, l=(h,l 2 ,--- ,1.). (6.66) 
Inserting it into ([6"l)2]t . with |/i o6 | < TV - ^ and |J| fc | < s k : k<£, we get 



E n^=E^ E ^+ E n M E ' :7r x II s '-. ( 6 - 67 ) 

t=l fc=0 5<fc<£ \ k6/ |,fc * =1 / 

-EEw e e fniwjVf^nO 

fe=5 J=l VkeJf .16/1 t , V ( / \ * / 



where the single factor $\xi$ in the last term comes from the sum over $l$. For $k \ge 5$, define

s 

4fe = E P^n^V (6-68) 
Clearly, it has at most $s^k$ terms of the form $\mathcal{P} \prod S$. Applying (6.65) again to $\mathbb{E}\, \mathcal{P}_{\gamma, \mathbf{k} + \mathbf{l}} \prod_t R_{i_t j_t}$,
we have



t 



^ k+ .n^ t - e e (um^x 

t l<o<f-fc-i \oG/|_ fc _, \ t 



|^ P 7 ,k+l+o R hj^j i-hab)° (6.69) 



+ 0((s(k + l) + s )i- k - l 4i+ l + s N ( --i +k+l ^) 

We now insert it back into (6.67), replacing the notation $\mathbf{k}, \mathbf{l}, \mathbf{o}$ with $\mathbf{k}_1, \mathbf{k}_2, \mathbf{k}_3$ and $k_t, l_t, o_t$ with $k_1(t)$,
$k_2(t)$, $k_3(t)$, respectively. Using

K^l E E f[\I^ + k j t \<N-^ S k S l (k + l) 1 , (6.70) 
ke/|. fc ie/f_ M t=i 

we obtain 

s 

t=l 

fei=o fc 1= 5fe 2 =i \k ie /| fci k 2 e/|_ fci fc2 \t=i / t 

£ S-kiS-ki-k 2 

+ EE E n k x n k% n k3 X (6.71) 

fci — 5 &2 — 1 fc3=l 

ee e fri i^wii^-SSwi) ^* + * + * n** 

^iGi| (fel k 3 ez|_ feitfc2 kae/|_ fel _ fe3)fc3 \t=i / * 

+ 0(£ 2 (s£ + s) c 4 s+ «iV-^), 

where the factor $\xi^2$ comes from the two summations. Define

^ : =- e e fni^&jiW^+taii^- ( 6 - 72 ) 

kiei| ifcl k a e/|_ fciiJka \t=i / t 

Clearly, letting $k := k_1 + k_2$, we find that $A_{k_1, k_2}$ has at most

s 

s k 1+ k 2 -Q |/^W+i (t) | < s fcl+fc2 (fci + l) fc2 < s fc (fc + l) fe < C(sk) k 
t=i 

terms of the form $\mathcal{P} \prod S$.

We repeat the previous procedure $\xi$ times. Recall that $k_i := \sum_t k_i(t)$. Let

ki := h +k 2 + ■■■ + k l: ki{t) :=k 1 (t)+k 2 (t) + ---+k i (t). (6.73) 

Define 

s /n-1 _ \ 

=(-ir i e e e - e n nitix.jr^^n 5 

(6.74) 



t 
Let $\# A_{k_1, k_2, k_3, \dots, k_n}$ be the number of the terms of the form $\mathcal{P} \prod S$ in $A_{k_1, k_2, k_3, \dots, k_n}$. Clearly, with

k := hi, 

\[ \# A_{k_1, k_2, k_3, \dots, k_n} \le s^k (k+1)^k \le C (sk)^k. \tag{6.75} \]

Thus, we obtain that 

II s = E n k 1 EA k 1 + E E n kink 2 A klM + E E E n fci n fc2 n fc 3 ^fci,fc2,fc 3 

t=l fci=0 fe 1= 5fe 2 = l /c 1= 5fc 2 = l fe 3 = l 

+ ••• (6.76) 



fcc + o(^( s e + s) e 4 s+ «iv-^), 



where we sum over $k_1, k_2, \dots, k_\xi$ under the conditions $k_1 \ge 5$, $k_i \ge 1$, and $1 \le k_1 + \cdots + k_i \le \xi$. The prefactor
in the error term comes from $\sum_{k_1, k_2, \dots}$. The equation (6.76) implies (6.21).

Now we are ready to prove (6.22). Since $\mathbf{a}$ plays the role of $(\sum_{i=1}^{n} \mathbf{k}_i)$ in (6.74), we have that

\[ |\mathbf{a}| = \sum_t a_t = \sum_i k_i. \tag{6.77} \]

t % 

Then, we obtain that 

\m< e #^i^,-,fc„n E i^i fct - 

Note that the Wigner matrix $H$ under the assumptions of 3.7 (i.e., $H^{v}$ in Lemmas 5.4 and 5.5) satisfies

\[ |\mathbb{E}(h_{ab})^k| \le (\log N) N^{-2-(k-4)\phi}, \qquad k \ge 5. \tag{6.78} \]
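
A bound of the form (6.78) follows from splitting off four powers of $h_{ab}$. The short derivation below is a sketch under two assumptions: the truncation $|h_{ab}| \le N^{-\phi}$ used earlier in this proof, and a fourth-moment bound $\mathbb{E}|h_{ab}|^4 \le (\log N) N^{-2}$, which is taken here as a hypothesis for illustration.

```latex
% Assumptions: |h_{ab}| \le N^{-\phi}  and  E|h_{ab}|^4 \le (\log N) N^{-2}.
\begin{align*}
|\mathbb{E}(h_{ab})^k|
 \le \mathbb{E}\big[\,|h_{ab}|^{k-4}\,|h_{ab}|^{4}\big]
 \le N^{-(k-4)\phi}\,\mathbb{E}|h_{ab}|^{4}
 \le (\log N)\,N^{-2-(k-4)\phi}, \qquad k \ge 5 .
\end{align*}
```

Each power of $h_{ab}$ beyond the fourth therefore gains a factor $N^{-\phi}$, which is what makes the terms with $k_1 \ge 5$ small.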

With (6.75) and the fact that $k_1 \ge 5$, we obtain (6.22).

Finally, we briefly explain the proof of (6.24) and (6.25). It is almost the same as the one for (6.21)
and (6.22), except for changing (6.62) to

s s I s \ 

II s ll^u,- E E Wy*T[Rui t )+0(s^^ s N-^), (6.79) 

t=i t=i i<fe<i \ ke/ |.fc 4=1 / 

i.e., we move the $k = 1, 2, 3, 4$ part from the first term on the right-hand side to the second term. Then
we keep using (6.56) and (6.65) to estimate $\mathbb{E}\, \mathcal{P}_{\gamma, \mathbf{k}} \prod_{t=1}^{s} R_{i_t j_t}$ as in the proof of (6.21) and (6.22). □

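Last, we prove Lemma 5.5.

Proof of Lemma 5.5. For simplicity, we only prove (5.14). The proof for (5.15) is similar.
Recall that $\eta = N^{-2/3-\epsilon}$. Define

\[ x^S := \eta \, \mathrm{Im} \, \mathrm{Tr} \, S = \eta^2 \sum_{i,j} S_{ij} \overline{S_{ij}}, \qquad x^R := \eta \, \mathrm{Im} \, \mathrm{Tr} \, R = \eta^2 \sum_{i,j} R_{ij} \overline{R_{ij}}. \tag{6.80} \]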
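
The second equality in each definition of (6.80) is the Ward identity for resolvents. The derivation below is a sketch, assuming that $S = (H - z)^{-1}$ for a Hermitian matrix $H$ and $z = E + i\eta$ with $\eta > 0$; the same computation applies to $R$.

```latex
% Assumption: H Hermitian, z = E + i\eta with \eta > 0, S = (H - z)^{-1}.
\begin{align*}
S - S^{*} &= S\big[(S^{*})^{-1} - S^{-1}\big]S^{*}
           = S\big[(H - \bar z) - (H - z)\big]S^{*}
           = 2i\eta\, S S^{*}, \\
\operatorname{Im} S_{ii} &= \eta\,(S S^{*})_{ii} = \eta \sum_{j} |S_{ij}|^{2}, \\
\eta \operatorname{Im}\operatorname{Tr} S
 &= \eta^{2} \sum_{i,j} S_{ij}\,\overline{S_{ij}} .
\end{align*}
```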

Recall also that $S = (H_\gamma - z)^{-1}$ and $R = (Q - z)^{-1}$, where all the entries of $H_\gamma$ and $Q$ are the same

except the $(a, b)$ entries. Then, since the rank of $(H_\gamma - Q)$ is at most 2, by the interlacing theorem, we
have

\[ |\mathrm{Tr} \, S - \mathrm{Tr} \, R| \le C \eta^{-1}. \tag{6.81} \]
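
A bound of the form (6.81) can also be checked by an elementary rank argument. The sketch below is an alternative to the interlacing argument cited above, not the authors' route; it assumes $H$ and $Q$ are Hermitian, $V = H - Q$ has rank at most 2, and $z = E + i\eta$, and it uses the general fact $|\mathrm{Tr}\, A| \le \mathrm{rank}(A) \, \|A\|$.

```latex
% Assumptions: H, Q Hermitian, V = H - Q of rank at most 2, z = E + i\eta,
% S = (H - z)^{-1}, R = (Q - z)^{-1}; \|\cdot\| is the operator norm.
\begin{align*}
S - R &= -R V S, \qquad
  \operatorname{rank}(S - R) \le \operatorname{rank}(V) \le 2, \\
\|S - R\| &\le \|S\| + \|R\| \le 2\eta^{-1}, \\
|\operatorname{Tr} S - \operatorname{Tr} R|
 &\le \operatorname{rank}(S - R)\,\|S - R\| \le 4\eta^{-1}.
\end{align*}
```

This gives the constant $C = 4$ under the stated assumptions.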
Together with (I5TM|) and (pn"2"j) . with high probability, 



max{|a; s | + \x R \\ < N Ce . (6.82) 

7 



From ([231) and (|3~^5j) . we find that 



\[ \max \big( |R_{ij}| + |S_{ij}| \big) \le N^{-1/3 + C\epsilon} + C \delta_{ij} \tag{6.83} \]
with high probability. We also have the trivial bounds

\[ x^S = \eta^2 \sum_{i,j} S_{ij} \overline{S_{ij}} \le \eta^2 N^2 \eta^{-2} = N^2, \qquad x^R \le N^2, \qquad |S|, |R| \le \eta^{-1} \le N. \tag{6.84} \]



Since the bad event has negligible probability, we ignore it in this proof.
Using the definitions leading up to (3.25), we get a telescoping series

E F ( ? ' 2 E G ^ ) - E F ( ^ E C tJ G~ j = [E F (x s ) -EF (x T )] . (6.85) 
From the Taylor expansion, we have

\[ F(x^S) - F(x^R) = \sum_{s=1}^{2} \frac{1}{s!} F^{(s)}(x^R) (x^S - x^R)^s + \frac{1}{3!} F^{(3)}(\zeta_S) (x^S - x^R)^3, \tag{6.86} \]

where $\zeta_S$ lies between $x^S$ and $x^R$, and we can obtain a similar formula for $F(x^T) - F(x^R)$ with $\zeta_T$ in
place of $\zeta_S$.

We now expand the term $S_{ij} \overline{S_{ij}}$ using (6.56), where the terms with the complex conjugate are
treated in the same manner. Letting $\xi = 3/\phi$ with $s = 2$ in (6.56), we can see that



SijSij = RijRij + Yl E V~ (M (R l3 R l3 )\(-h ab ) k + 0{CN- 3 ) 



(6.87) 



i<fe<3/0 \ke/ 



3/0, fc 



holds with high probability. Averaging over $i, j$ and multiplying both sides by $\eta^2$, we obtain

/ 



x s = x R 



E E E^.k (RijRij) (-h ab ) k + 0(CN~ 3 ). 



(6.88) 



i<fc<3/0 \keq /4>k .j 
Now, we claim that for any fixed k ^ 0, k £ J|, , fc , and p = 0(1) with p 6 2Z, 



E 



< (N 1+Ce ) p . 



Assuming the claim (|6.89|) . with Markov inequality, we find that, for any kg I 2 ^ k and k ^ 



\V l!k x R \ := |r? 2 ^P 7 ,k(%%)| < W" 1 



/3+Ce 



(6.89) 



(6.90) 



holds with probability at least $1 - N^{-D}$ for any $D > 0$.
For simplicity, we show the proof for 



< (7V 1+C,e ) p . 



(The claim (6.89) can be proved similarly.) Using (6.24) with $s = 2p$ and $\zeta = p$, we get

1 1 r. , k •Jt.jtj^) II/ 1 u (S^Aa) 

t=l t=l 

l<\a\<2p/4> / p \ 

Qi,a 2 ,...,a 2p >0 \t=l / 



(6.91) 



(6.92) 
With (6.25), in order to show (6.91), it only remains to prove that



e e n^fesu) 



«l ,i P ,j P \i=l 

and for a e R 2p , 1 < \a\ < 2p/<j), 



< (N 1+Ce ) p 



< (N 1+Ce ) p . 



(6.93) 



(6.94) 



E E7 v n^.kfe^; 

,* P ,Jj> \t=l 

We give the proof for (6.93). The proof of (6.94) is the same except that it is slightly longer by one
term of $\mathcal{P}_{\gamma, \mathbf{a}}$. Using (6.37), with

\[ f(I, J) = N^{-2p}, \qquad I = (i_1, \dots, i_p), \qquad J = (j_1, \dots, j_p), \]



an d rit=i ^7.k (Si t j t Si t j t ) playing the role of n*=i ^iTjT m Q6-37p , G being G°, we have 

E E^ 2p n^.k(^A 



J,J t=l 



< 



Ej2N- 2p Y[V ltk (G itjt G itjt ) 
i, j t=i 



(6.95) 



max(7V-^ 20 )^. I k 'l 

k,n,7 



(Gi t j t Gi t j t ) 



i.j 



+ 0(N-() 



where 



k!elR 2p , k 2 el 2p+|kl1 , k 3 e M 2p+|kll+|k21 , (6.96) 

From (3.3) and the assumption on $z$ in (5.13), with high probability,

\[ |G_{ij}| \le N^{-1/3 + 2\epsilon} + 2 \delta_{ij}. \tag{6.97} \]

Now, we estimate the term 

(V 7 ,k(G ltJt G Mt )) (6.98) 

as in (6.44). First, it is the sum of at most $C^{\sum_i |\mathbf{k}_i| + |\mathbf{k}|} = O(1)$ products of $G_{ij}$ (possibly $i = j$),
where in each product the number of $G_{ij}$ is $\sum_i |\mathbf{k}_i| + 2p = O(1)$. Since $G$ satisfies the rough bound
$|G_{ij}(z)| \le \eta^{-1} \le N$, we know (6.98) is always bounded by $N^{O(1)}$. Since (6.97) holds with high

probability, when estimating (6.98), we may neglect the event that (6.97) does not hold. For each
product of the above type and for any fixed $t$, the indices $i_t$ and $j_t$ appear only twice each. Since $\mathbf{k} \ne 0$,

they cannot attain the form $G_{i_t j_t} \overline{G_{i_t j_t}}$. Thus, they must appear in one of the following forms for
some $a, b, c, d$, which come from the $\mathcal{P}$'s:

\[ G_{i_t a} G_{b j_t} \overline{G_{i_t j_t}}, \qquad G_{i_t j_t} \overline{G_{i_t a} G_{b j_t}}, \qquad G_{i_t a} G_{b j_t} \overline{G_{i_t c} G_{d j_t}}. \tag{6.99} \]
For each case, after averaging over $1 \le i_t, j_t \le N$, i.e., applying $N^{-2} \sum_{i_t, j_t}$, these terms are bounded
by $N^{-1 + C\epsilon}$. Thus, so far we have proved that, for each $t$, the term $G_{ij}$ with an index $i_t$ or $j_t$ contributes
a factor $N^{-1 + C\epsilon}$ to

E^ 2 ^7„,k„---?V kl (v^{Gi tjt G itjt )) ■ (6-100) 

Similarly, the $G_{ij}$'s with indices $i_1, j_1, \dots, i_p, j_p$ contribute a factor $(N^{-1 + C\epsilon})^p$ to (6.100). By (5.10), it

is bounded by $X + CN^{-1}$. For the other $G$'s without indices $i_1, j_1, \dots, i_p, j_p$, we simply bound them
by a constant $C$. Therefore, we obtain that (6.100) is bounded by $(N^{-1 + C\epsilon})^p$ with high probability.
Then, the expectation of (6.100) is less than $(N^{-1 + C\epsilon})^p$. Analogously, we can bound the first term on
the right-hand side of (6.95) by $(N^{-1 + C\epsilon})^p$. Thus, we have proved (6.93) and (6.94), which implies (6.91)

with (6.25). We can complete the proof of (6.89) similarly.

Now, we return to estimating $x^S - x^R$ in (6.88). First, we note that

\[ \mathbb{E}|h_{ab}|^3 \le \big( \mathbb{E}|h_{ab}|^2 \, \mathbb{E}|h_{ab}|^4 \big)^{1/2} \le (\log N) N^{-3/2}. \]
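
The third-moment bound above is an instance of the Cauchy-Schwarz inequality. The lines below spell it out as a sketch, assuming $\mathbb{E}|h_{ab}|^2 \le N^{-1}$ and $\mathbb{E}|h_{ab}|^4 \le (\log N) N^{-2}$; both moment bounds are hypotheses stated here for illustration.

```latex
% Assumptions: E|h_{ab}|^2 \le N^{-1}  and  E|h_{ab}|^4 \le (\log N) N^{-2}.
\begin{align*}
\mathbb{E}|h_{ab}|^{3}
 = \mathbb{E}\big[\,|h_{ab}|\cdot|h_{ab}|^{2}\big]
 &\le \big(\mathbb{E}|h_{ab}|^{2}\big)^{1/2}\big(\mathbb{E}|h_{ab}|^{4}\big)^{1/2} \\
 &\le N^{-1/2}\,(\log N)^{1/2} N^{-1}
 \le (\log N)\,N^{-3/2} \qquad (N \ge 3).
\end{align*}
```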

With (6.90), we can see that there exists a constant $C$ such that

\[ \mathbb{E}|x^S - x^R|^3 \le N^{-5/2 + C\epsilon} \tag{6.101} \]

for any sufficiently large $N$ independent of $\gamma$. Together with the fact that $\zeta_S$ is between $x^S$ and $x^R$,
we get $|\zeta_S| \le N^{C\epsilon}$ (see the bound on $|x^S| + |x^R|$ above) with high probability, hence

7o 



^e[f( 3 )(C5)(x s -x«) 3 



7=1 



where we have used (5.12) on $F$. We can estimate $\mathbb{E}\big[ F^{(3)}(\zeta_T) (x^T - x^R)^3 \big]$ analogously.
From (6.85) and (6.86), it only remains to prove that, for $1 \le s \le 2$,



2 



f {s \x r ){x s - x R y 



F^(x R )(x T -x R Y 



Using (|6.88[) again, recalling (H 1 ^i) ab has the distribution of H ab , we have 



E 

l<fc<3/0 



E »? 2 E^,k(%%) I (-M fe + o(Civ- 3 ) 



with Eh k ab 



< 



Eh* b , (1 < k < 4). Thus, we obtain that 



E 



(xr)(x t ~ x R y 



(6.102) 



(6.103) 



(6.104) 



(6.105) 



9/0 



E E E =11 to* 



k = 5 ELi |kt|=fck t e/ 3 2 



(jE^Ml + |E(-M fc |) +0(CiV- 3 ), 



where in the right hand side we sum up k from 5. Since 



E{-h ab ) k < {log N) C N~ 5 / 2 and |E(-/i a6 ) fc | < 



(logA r ) c A r " 2 "^ , using (f6790|) . we obtain (|6.103j) and complete the proof. □ 